MARINE ECOLOGY
  • Home
  • Blog
  • Research
    • Microplastics
    • Oyster Mortality
    • Tipping Points
  • CV and Publications
  • Contact Me

BLOG

New posts weekly!

Talk Data to Me

12/7/2023

 
Picture
This week I started the lengthy process of analyzing the data from the mesocosm experiment and summarizing my findings. Data analysis can be a quick step relative to the full scope of an experiment, but in our recent meeting my PhD committee impressed upon me the importance of slowing down and sitting with the data to build an effective narrative. So that's what I've been doing.

Non-scientists are likely familiar with the data analysis features of Microsoft Excel, and perhaps with the add-ons available for it. Many scientists, however, use Excel to organize their data and then import the spreadsheet into a more robust analysis program. One reason I enjoy working outside of Excel is that it has limited functions for some of the more complex analyses I need to run, and it can struggle when the data set is large. For example, my mesocosm file is split across 17 spreadsheets, only one of which is a metadata (notes) page. The largest spreadsheet has 165 columns of data, and Excel gets slow when I need to edit some of the larger sheets. Therefore, I use one of three data programs depending on my needs: (1) R and (2) PRIMER for data analysis and figures, and (3) Matlab for maps and some figures.

R is the very first robust data program I learned to use, back in my second semester of college. I used it in my ecology laboratory class, but I struggled with the software because my lecture professor didn't teach us how to code, while the professor of the other lecture section incorporated it frequently into his class. That created a large disparity among the students, which I think was apparent in the lab course. With R, as with some other data programs, you type out what you would like the program to do with the data and it does it: anything from summarizing the data and reporting the mean value to running complex analyses and generating figures. Some researchers even use R for image analysis methods they've developed. However, I have two issues with the program. First, the error messages that appear when R cannot run the code you've given it can be quite vague. Part of that is likely that I don't fully understand the syntax R uses to describe the problem, but sometimes the language is simply unclear. Anytime I see an error without a clear cause, I go back through everything I typed to find out what went wrong. This leads to my second, smaller complaint: R is very much a coding program, where you have to type out every task you would like it to accomplish. R does have some predictive (autocomplete) capabilities, but it is still a lot of typing, which gets frustrating when you type out lines of code and then hit an error. I do appreciate coding in R because it is such a common data program, which means there are great resources available through a Google search when I run into problems. R can also make some nice figures, like the one shown here from my mesocosm work. Notably, I have covered up a lot of the information on the figure so that my current analysis and results are not shown prior to review and publication.
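To give non-coders a feel for what "typing out what you would like the program to do" looks like, here is a minimal sketch of a scripted summary. It is written in Python rather than R, uses only the standard library, and is not code from my analysis. The treatment names, the values, and the `summarize` helper are all made up for illustration.

```python
from statistics import mean, stdev

# Hypothetical measurements (made-up placeholder values, not real data):
# each record is (treatment, value).
records = [
    ("control", 4.0),
    ("control", 6.0),
    ("heated", 7.0),
    ("heated", 9.0),
    ("heated", 11.0),
]


def summarize(rows):
    """Group values by treatment and return {treatment: (n, mean, sd)}."""
    groups = {}
    for treatment, value in rows:
        groups.setdefault(treatment, []).append(value)
    return {
        name: (len(values), mean(values), stdev(values))
        for name, values in groups.items()
    }


summary = summarize(records)
for name, (n, avg, sd) in summary.items():
    print(f"{name}: n={n}, mean={avg:.2f}, sd={sd:.2f}")
```

Every step, from grouping to formatting the output, has to be spelled out, which is both the appeal (the analysis is fully written down and repeatable) and the source of the typing-and-errors frustration described above.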

I use PRIMER when the data from my experiments do not follow a normal distribution, even after transformation. As scientists, we make assumptions when we analyze data, and one common assumption is that the data follow a normal distribution (a bell curve). However, there are situations where this does not hold, and we need to evaluate our data using an alternative (nonparametric) method. PRIMER is great for these alternative analyses, and it is more user-friendly than other programs because it is menu-driven rather than code-driven: the user selects what they would like the program to do to the data. PRIMER also makes nice figures, but because it relies on pre-programmed click options, some changes you'd like to make to a figure are not possible.
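For readers curious what an "alternative method" can look like, one general family of approaches that does not assume a bell curve is the permutation test. The sketch below is a simple two-group version in standard-library Python. It is not PRIMER's own routine and not my actual analysis; the function name `permutation_test`, its parameters, and the sample values are all made up for illustration.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed difference and an approximate p-value.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add 1 to numerator and denominator so the p-value is never exactly 0.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up, skewed placeholder values (not data from the experiment).
control = [1.0, 1.2, 0.9, 1.1, 5.0, 1.3]
treated = [2.9, 3.4, 3.1, 9.8, 3.0, 3.3]
observed, p = permutation_test(control, treated)
print(f"observed difference = {observed:.2f}, p = {p:.4f}")
```

The idea is that if the treatment had no effect, the group labels would be interchangeable, so the test asks how often a random relabeling produces a difference at least as large as the one observed. No normal distribution is assumed at any point.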

Finally, Matlab is a program used often by mathematicians and oceanographers, and I learned about it when I took an oceanographic modeling course during my PhD. As I'm not working on any modeling components for my current research, I don't have much use for Matlab, though it is great at making maps of sampling sites since it has detailed geographic data and local, regional, and world maps that I can alter. 

This week's post was a lot about data, and while it isn't the most exciting part of my work, data analysis helps establish the narrative for my research findings and manuscripts. Stay tuned next week for an exciting post about what's happening in the new year.

Photos from unukorno, Grace Courbis