AgStats 2023: applied stats and ecotourism in Indiana

6 minute read


I just got back from AgStats 2023 (Conference on Applied Statistics in Agriculture and Natural Resources) held at Purdue University in West Lafayette, Indiana, my first foray into an applied agricultural stats conference.

<i>Giving a talk at AgStats</i>
Giving a talk at AgStats

I had a great time at the conference. I liked the size and enthusiasm of the crowd, which was probably at most 60 or 70 people with a lot of students and postdocs. Clearly the conference organizers knew each other from way back — they made lots of corny jokes about how everyone was getting old and reminisced about line dancing through the years. It was wonderful to get to meet my ARS statistician colleague Kathy in person, and to see my old friend Andy Finley from my time at Michigan State. The conference was held in a building in the middle of a cornfield at Purdue’s ag experiment station a few miles away from the campus, which definitely set the tone.

I won’t mention every talk here, but I took notes on a few of them that I was especially excited about. The key themes that really piqued my interest were geospatial statistics, Bayesian approaches, stats education, and stats for long-term breeding trials.

Spatial stats: Andy Finley’s presentation on using nearest neighbor Gaussian process models with spatial data was very interesting. I had seen a similar talk back at MSU but it was a good reminder that it might be a useful approach for some of the datasets our scientists are working with. He compared and contrasted his method with the INLA approach and the reduced-rank approach, demonstrating that the nearest neighbor approach does much better in terms of approximating the full spatial model while remaining computationally feasible. Some of it is implemented in R packages called spBayes, spNNGP, and spOccupancy. But it turns out that for complicated real-world use cases like the one he presented using Forest Service FIA data, you can’t really use an off-the-shelf package. I think he coded everything in C. So that’s a bit of a steep barrier to entry. Predicting over spatial fields came up in other talks throughout the conference. For example Eryn Blagg (PhD student from Iowa State) talked about various approximated kriging approaches. That also gave me some ideas for things to look into later.

Stats ed: Sarah Manski, a PhD student from Michigan State, gave a talk on teaching a Bayesian stats and R workshop to Ethiopian ag scientists. Her short tutorial “You Can Learn R” might be a useful resource. It doesn’t teach R directly but aims to help beginners figure out what they should do if they want to learn R. A lot of ARS scientists would find it helpful!

Breeding trials: Another talk I found interesting was one given by Marcia Almeida de Macedo from Syngenta. She is working with some of their long-term breeding trial data, using it to do bootstrap simulations to figure out what is the optimal number of checks to include in a trial, and how many years checks and test varieties should be included before being gradually replaced. This was very interesting to me because some SEA scientists have similar long-term datasets that could be used in this way. Unfortunately she couldn’t give too many specifics because they’re guarding their trade secrets.

Cross-validation: Finally, I liked Rob Tempelman (also Michigan State)’s talk on blocked or grouped cross-validation. If you have a multilevel model, you have to think carefully about what level you want to do the cross-validation or out-of-sample predictions. You can get falsely optimistic results if you don’t do it correctly. This would be the case for block designed studies as well as anything with spatial or temporal dependence. A 2017 paper in Ecography by Roberts et al. is a useful reference. I also learned about the R function groupkfold() in the caret package which is a nice way to do that kind of cross-validation.

My own talk was about a dataset I am working on with an ARS scientist. I got some very good peer review from the statisticians and figured out a lot of things that I could improve about the model. It was intimidating to present my stat model to the likes of Hans-Peter Piepho, but he did not disappoint, making several points that directly led to changes I need to make. I am currently working on getting those changes done!

<i>Me and my dogg Neil</i>
Me and my dogg Neil

On Wednesday afternoon, I missed a few of the talks because I went to Purdue campus with a SEA scientist collaborator to meet with another collaborator there and discuss a project. That was a productive meeting, and it was also fun to wander around the campus and have my picture taken with the Neil Armstrong statue. Purdue hypes up their connection with Neil Armstrong — their branding slogan is even “Giant Leaps.” Pretty cool, though, I have to admit.

<i>Bay-breasted warbler</i>
Bay-breasted warbler

Aside from the conference, by far the best thing about my trip to West Lafayette was staying at a hotel that was literally across the road from Celery Bog. That is a natural area with a great mix of habitats including ponds, marshes, shrubby thickets, and woods. This makes it a magnet for migrating birds, and mid-May is right when the peak songbird migration hits the upper Midwest. I visited the bog five times in the just over three days I was there, and enjoyed some of the best birdwatching I have ever had. Over that time I saw twenty species of warbler including the beautiful bay-breasted, Blackburnian, mourning, magnolia, and redstart warblers. Not to mention you could not throw a rock without hitting a singing Baltimore Oriole or Rose-breasted Grosbeak (this is not condoning throwing rocks at birds)! I really leaned into the “Indiana ecotourism” aspect of the trip. Looking forward to more in the future!

Leave a Comment