ReNewportProjectMinutes


June 24th, 2024

Agenda:

  • DataBloom hour
  • Discuss forward progress with what's left to do:
    • Danny:
      • focus on HeatMap
    • Tyler:
      • implement PM2.5's "Newport Correction" (using PM2.5_cf, rather than PM2.5_atm).
    • Ethan:
      • implement the email feature
  • Our presentation is the week after the 5th of July; do we have a date/time?
  • No new features (after the heatmap).
    • Document, document, document: We need to turn this over in pretty good shape, so that DataBloom can move forward.
    • Let's not forget the data cleaning diagram for Josh
  • What sort of configuration would need to be done if someone were to adapt our software for their PurpleAir Monitors?

June 20th, 2024

Agenda:

  • Next meeting: DataBloom wants an hour
  • Discuss results of Tuesday's meeting
  • Talk about what's left to do
    • Danny:
      • To focus on HeatMap
    • Tyler:
      • To implement PM2.5's "Newport Correction" (using PM2.5_cf, rather than PM2.5_atm).
    • Ethan:
      • to implement the email feature
  • Can we agree to present once Ethan is back from Florida, and Danny back from Chicago?
    • week after the 5th of July; Nelum will set that up.
  • No new features (after the heatmap).
    • Document, document, document!
    • We need to turn this over in pretty good shape.
    • Let's not forget the data cleaning diagram for Josh
  • What sort of configuration would need to be done if someone were to adapt our software for their PurpleAir Monitors?
  • Andy agreed to mention "future work" with DataBloom, and that our students should not expect to work for free.

June 17th, 2024

Agenda:

  • Reminder: Tomorrow's meeting is at 10 am, not 11.
  • Tyler: the Dashboard is running on his own server. Tell us more!
  • Colocation data comparison results (between EPA and PurpleAir monitors in September, 2023)
  • Database conversion issues: Danny's csv bug....
  • Confidence score ideas
  • Tim's wishlist:
    • Transparency, transparency, transparency
    • Diagram of data cleaning process
  • Status of GroundWork data, and how will it be best incorporated by Tyler into the existing database?
  • Wiki on GitHub?
  • Danny and Ethan updates: Are we ready for tomorrow's demo?
  • Tim and Rachel:
    • Think about what we want to present; maybe a Google Form survey (which had a limited response). Tyler will have a live version, so that they can try it out on their own.
    • Missing features? What
    • How is it working at the moment? Domain? Is it completely standalone?
    • How do we transition? Documentation...

June 13th, 2024

Agenda:

  • Tim will be with us for the first half hour. In particular we need to discuss
    • Might have a diagram illustrating the cleaning process.
    • Raspberry Pi to collect data, and then fire up the revised files....
    • Be as transparent as possible; we're not doing research level work; just explain exactly what we've seen.
  • Updates:
    • Danny: demo the dashboard, as it is.
    • Tyler (and Danny): discuss the need to move to a database, rather than to large csv files
    • Tyler: repeated header in several ab data files
    • Ethan:
      • Tell us about the RSS feed
      • We'll want some local resources as "useful links" for Newport people, e.g. GreenUmbrella, GroundWork, and acknowledgements (e.g. DataBloom, NKU Department of Mathematics and Statistics, CINSAM)
      • Any luck getting this on GitHub?
    • The two channels from PurpleAir monitors appear to be giving quite different readings in many cases. According to PurpleAir, there are lots of reasons that this might be the case. Spider webs, for example...!
      • The Confidence Score

        [ael: I can find no definition of "the confidence score", and, although users have asked for it on their website, they do not appear to have responded....]

        "The Confidence Score is a measure of how confident PurpleAir is in the readings a sensor is reporting. That is, it reflects PurpleAir’s belief about the reliability and credibility of a sensor’s data. A higher confidence score indicates a greater level of confidence in the accuracy of the sensor’s data, while a lower score suggests a lower degree of confidence."

      • Sensor Maintenance
      • Replacement laser counters
      • Calibration of PurpleAir Monitors (and why we should use "alt").
      • PurpleAir sensors can be expected to last about two years...
      • What Do PurpleAir Sensors Measure and How Do They Work?
        • Channel Output Deviation
        • PurpleAir sensors, and the laser counters within them, are not perfect. The two channels inside a device will not always track strictly with each other. Sometimes, this deviation can be due to an issue with the sensor or its readings. However, this is not always the case. There is an accepted Maximum Consistency Error. According to the laser counter manufacturer, Plantower, the accepted range for both PMS5003 and PMS6003 laser counters is as follows:

          • ±10μg/m³ @ 0~100 μg/m³
          • ±10% @ 100~500 μg/m³

          Please note, these ranges are for raw PM2.5 readings, and not AQI levels.

        • "It should be noted that while PurpleAir sensors have been found to be highly precise in their PM2.5 and PM1 estimates, it has been demonstrated that they drastically underestimate PM10 pollution levels. As such, we do not recommend using the PM10 readings from PurpleAir devices for health-based decisions." [ael: note that this (and the "consistency" remark earlier) is about precision, not accuracy.]
        • PurpleAir PM2.5 U.S. Correction and Performance During Smoke Events 4/2020 (EPA document):

          "For the national data set of sensors collocated with regulatory-grade monitors, results show that PurpleAir sensors, when corrected, accurately report NowCast AQI categories 90% of the time as opposed to uncorrected PurpleAir data, which are accurate only 75% of the time. Testing the correction scheme for the research data set of wildland fire smoke events revealed that the corrected data compared closely with the reference monitors and produced similar NowCast AQI categories."

      • Discuss making the GitHub public, so that we can document with the Wiki....
      • Colocation data (Andy is working on this):
    • EPA's AQI calculation:
    • There is a meeting next Tuesday. We'll prep for it on Monday. This is a demo for the group, and Danny suggested sending around a survey after we're done with the demo.
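A minimal sketch of how the Maximum Consistency Error ranges quoted above (under Channel Output Deviation) could be turned into an A/B channel check. It is written in Python for illustration, although the project itself uses R. The function name `channels_agree`, the choice to compare the channel difference against the channel mean, and the treatment of readings outside 0~500 µg/m³ as failing are all assumptions, not PurpleAir's or Plantower's method.

```python
def channels_agree(pm_a: float, pm_b: float) -> bool:
    """Check two raw PM2.5 channel readings (µg/m³) against the quoted ranges.

    Assumption: the tolerance applies to the absolute difference between
    the channels, and the band is selected by the mean of the two readings.
    """
    mean = (pm_a + pm_b) / 2.0
    diff = abs(pm_a - pm_b)
    if mean < 0.0 or mean > 500.0:
        return False              # outside the range the manufacturer specifies
    if mean <= 100.0:
        return diff <= 10.0       # ±10 µg/m³ band for 0~100 µg/m³
    return diff <= 0.10 * mean    # ±10% band for 100~500 µg/m³


if __name__ == "__main__":
    # Made-up readings, for illustration only.
    for a, b in [(5.8, 6.3), (20.0, 35.0), (200.0, 215.0), (200.0, 230.0)]:
        print(a, b, channels_agree(a, b))
```

A flag like this is one possible input to the confidence idea discussed in these minutes. As the PurpleAir note says, the ranges apply to raw PM2.5 readings, not AQI levels.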

    Resources

    CF=1 and ATM
    
    CF=1 and CF=ATM (appearing as _cf_1 and _atm in the PurpleAir API) are formulas used in Plantower laser counters. One of the things that these formulas do is estimate the density of the particles passing through the laser counters. With both the volume and density of the particles, the laser counters can generate a mass measurement (i.e. µg/m³).
    
    CF1 and ATM data are shown by default on the PurpleAir Map as follows:
    
        Indoor sensors* display CF1 data
        Outdoor sensors* display ATM data
    
    *Indoor and outdoor sensors, in this context, are determined by how the sensor is registered. Thus, a PurpleAir Classic, while typically being used outdoors, can be registered as an indoor sensor, and the data from the sensor will be CF1 data on the map.
    
        It should be noted that both the CF=1 and ATM algorithms are unknown to us; the manufacturer, Plantower, considers these formulas to be proprietary and hasn’t provided them to us. We follow their instructions for how to use the data, which comes directly out of the laser counters as CF=1 and ATM data.
    

    June 10th, 2024

    Agenda:

    Updates:

    • Danny: how goes getting real data into the demos?
    • Tyler: looks like there is only one channel for the weather measurements.....
    • The two channels from PurpleAir monitors appear to be giving quite different readings in many cases.
    • Nick's app (described in Weather and Air Quality Shiny App) has some nice features.
    • Air Quality impacts
    • This article, Superfast Microsoft AI is first to predict air pollution for the whole world, describes Aurora:

      AI researcher Paris Perdikaris at Microsoft Research AI for Science in Amsterdam and his colleagues found that Aurora could in less than a minute predict the levels of six major air pollutants worldwide: carbon monoxide, nitrogen oxide, nitrogen dioxide, sulfur dioxide, ozone and particulate matter. Its predictions span five days. It can do it “at orders of magnitude smaller computational cost” than a conventional model used by the Copernicus Atmosphere Monitoring Service at the ECMWF, which predicts global air-pollution levels, the team wrote in a preprint published on arXiv on 20 May.

      Reference: AURORA: A FOUNDATION MODEL OF THE ATMOSPHERE: Air quality is a key factor in non-communicable disease and therefore the health of humans, and is determined by concentrations of various gasses and aerosols in the atmosphere (World Health Organization, 2021). Accurately predicting global atmospheric composition (the distribution of trace gases and aerosols in the air) can aid mitigation of air pollution events.

    Resources discussed today:

    June 6th, 2024

    Agenda:

    Updates:

    • Danny has some new graphs; how goes the work of getting real data into the demos?
    • Tyler has posted new data (took two and a half keys! Thank goodness you have so many siblings!:). Do we have both channels for all data?
    • There are improvements in interacting with GitHub, yes?
    • DataBloom working document in conjunction with our project.


    June 3rd, 2024

    Agenda:

    • Update with Tim
      • Double check code -- another set of eyes....
      • Make sure that we can handle other kinds of data....
      • Screen shots to broader group
      • EPA data -- getting going?
      • Getting this going live
    • Review of Changes from
      • Danny
      • Ethan
      • Tyler

      Data and Analysis:

    • Tyler:
      • Thanks for making the data available! Time to play....
      • Work on getting both A and B sensor values, so that we can incorporate "confidence" (since we lack a lot of confidence in PurpleAir now).
      • Figure out how to get the GroundWork data into shape.
    • Danny:
      • Focus on bringing in real data for testing, including EPA data
      • Add multiple timelines and average for comparisons.
      • Get the thumbs-up/thumbs-down icon to rotate with confidence...:)
    • Andy:
      • Note: I'm using the data Tyler provided as a Zip a week or so ago.
      • Tyler: why are the time_stamps not given in order?
      • Simple regressions on 11 PurpleAir sensors (note: only using about 96% of the data, due to outliers).
        • humidity and pressure are significantly correlated with PM2.5 (or PM10.0, for that matter), with an R² of about 0.15:
          Linear Regression:        Estimate            SE              Prob

          Constant                 -417.589       (5.62323)           0.00000
          humidity                 0.221699       (3.173477E-3)       0.00000
          pressure                 0.421619       (5.587148E-3)       0.00000

          R Squared:               0.153543
          Sigma hat:               8.72065
          Number of cases:             44620
          Degrees of freedom:          44617

          Notice that PM2.5 rises with both humidity and pressure.

        • humidity alone has an R² of roughly 0.045.
        • pressure alone has an R² of roughly 0.060.
        • temperature was not significantly correlated with PM2.5.
        • PM2.5 and PM10.0 are significantly correlated, with an R² of about 0.98:

          Image 6-3-24 at 9.06 AM.jpeg


      • Wild values of PM2.5 and PM10.0 from sensor 184711, e.g.:

          1713726000,184711,26.6,64.867,1002.752,5.817500000000001,6.313000000000001
          1715500800,184711,51.367,65.6,994.98,2315.299,2315.9919999999997
          1715508000,184711,53.6,63.8,995.847,2314.5375,2315.3104999999996
          1701010800,184711,48.667,55.633,995.309,25.297,26.908

        1715500800 is Sunday, May 12, 2024 4:00:00 AM GMT-04:00 DST, and 1715508000 is Sunday, May 12, 2024 6:00:00 AM GMT-04:00 DST.
      • Wild values of pressure from sensor 184671, e.g.:

          1710482400,184671,4.067,56.933,999.383,1.6575,2.0700000000000003
          1710486000,184671,4.333,58.733,1001.617,1.6755,2.1295
          1710504000,184671,28.133,59.966,1521.564,14.5125,15.3475
          1704589200,184671,66.867,44.833,987.955,39.079,45.9375
          1702378800,184671,60.3,36.267,1008.583,16.448999999999998,18.6815
          1715835600,184671,0.0,71.867,626.308,23.0035,23.924999999999997
          1715853600,184671,0.0,67.1,623.128,32.4995,36.3705
          1715893200,184671,0.0,92.8,641.13,10.332,10.724
          1710064800,184671,45.633,40.733,995.157,1.6075,1.8935

        1710504000 is Friday, March 15, 2024 8:00:00 AM GMT-04:00 DST.

        Nothing going on then: https://www.timeanddate.com/weather/usa/cincinnati/historic?month=3&year=2024

      • https://www.epa.gov/outdoor-air-quality-data/air-data-daily-air-quality-tracker
      • https://www.epochconverter.com/
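The checks Andy did by hand above (converting epoch timestamps and spotting wild PM2.5 or pressure values) could be scripted roughly as follows. This is a sketch in Python rather than the project's R code. The column order is the one recorded in the May 23rd minutes, and the three sample rows are copied from the notes above. The plausibility limits, the function names, and the fixed UTC-4 offset (matching the GMT-04:00 conversions in the notes, so not valid for winter timestamps) are assumptions made for illustration.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

# Column order of the PurpleAir csv files, per the May 23rd minutes.
COLUMNS = ["time_stamp", "sensor_index", "humidity", "temperature",
           "pressure", "pm2.5_atm", "pm10.0_atm"]

# Three rows copied from the minutes.
SAMPLE = """\
1715500800,184711,51.367,65.6,994.98,2315.299,2315.9919999999997
1701010800,184711,48.667,55.633,995.309,25.297,26.908
1710504000,184671,28.133,59.966,1521.564,14.5125,15.3475
"""

# Hypothetical plausibility limits, chosen only for illustration.
PM25_MAX = 500.0                  # µg/m³, upper end of the Plantower range quoted in the minutes
PRESSURE_RANGE = (900.0, 1100.0)  # assumed to be millibars, judging by typical values

# Fixed offset matching the "GMT-04:00 DST" conversions in the notes.
EDT = timezone(timedelta(hours=-4), "EDT")


def parse_rows(text):
    """Parse headerless csv text into a list of dicts with numeric values."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=COLUMNS)
    rows = []
    for raw in reader:
        row = {"time_stamp": int(raw["time_stamp"]),
               "sensor_index": int(raw["sensor_index"])}
        for name in COLUMNS[2:]:
            row[name] = float(raw[name])
        rows.append(row)
    return rows


def flags(row):
    """Return the names of the columns whose values look implausible."""
    problems = []
    if row["pm2.5_atm"] > PM25_MAX:
        problems.append("pm2.5_atm")
    low, high = PRESSURE_RANGE
    if not low <= row["pressure"] <= high:
        problems.append("pressure")
    return problems


def local_time(time_stamp):
    """Convert Unix epoch seconds to a datetime at the fixed UTC-4 offset."""
    return datetime.fromtimestamp(time_stamp, tz=EDT)


if __name__ == "__main__":
    for row in parse_rows(SAMPLE):
        print(row["sensor_index"], local_time(row["time_stamp"]).isoformat(), flags(row))
```

Hard limits like these only catch gross errors. Whether flagged rows should be dropped or just marked is a separate decision that belongs in the data cleaning documentation.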

    Upcoming Meetings:

    • June 6th, regular meeting but at 12:00.
    • Meeting with Greg at 3:00, for AI modeling tips.
    • June 13th, 11:00 with Tim and Rachel.

    May 30th, 2024

    Agenda:

    • We plan a "deep dive" into the code.
    • We need to request a meeting with Greg

    Today's Issues

    • Tyler:
      • Noted an error in the PurpleAir data -- what's that all about?
      • Where is the cleaned data being kept, as well as scripts, documentation, etc on how it was created? (MetaData)
      • Shouldn't data be kept on our GitHub? We need that to begin the analysis....
    • Danny:
      • I'm getting an error: "'to' must be a finite number" (looks like it may be a -Inf in the code): Warning in max(dataset$last_seen) : no non-missing arguments to max; returning -Inf
    • Andy:

    May 28th, 2024

    Agenda:

    We have a small group meeting (with DataBloom folks) to start at 11:00, followed by a "deep dive" into the code.

    Introductions

    Ethan Davis (davisethan835@gmail.com) is a student focused on the website, and especially on getting the help together. His official title has "Wizard" in it: "Wizard of Wowness"?

    DataBloom focus:

    Danny will provide an update on the dashboard, demonstrating alignment more along the lines Tim indicated in an email (with nods to accessibility). We'll follow with their feedback.

    Our focus today:

    Nelum had suggested that we do a "deep dive" into the code so far. The GitHub site is https://github.com/Newport-Air-Quality-Dashboard/Resources

    Danny was a little late, due to a flat tire, so Tim had to leave before we got to "the show". So Tyler led us off with some comments about his progress so far.

    • Tyler:

      • bash shell script converts EPA data to PurpleAir -- nice stuff!
      • Crontab to run these scripts?
      • All of this in Raspberry Pi? We're not sure.
      • Makes sense to make things consistent with PurpleAir format? So we hope that the GroundWork data will be easy to adapt to this format.
      • Issue with lat/long for PurpleAir, 35 sensors, 5 months or so.... Seems like a one-off problem, because moving forward we'll be updating the PurpleAir data with data that includes lat/longs.
      • Tyler is leaning away from a database.
      • Where is this cleaned data?

    • Tim (needed to leave, so I asked him for any issues on his end):

      • How does it look live?
      • Make sure that everything is working together (e.g. server for database, shiny server, etc.)
      • Rachel is out of office -- back on 6/7

    • Server?

      • Raspberry Pi

    • Ethan:

    • Danny:

      • Gave a run-through with Rachel, to elicit comments
      • Discussion of the value of calendar plots -- more for long term visual, and for EDA -- seasonal effects, etc. Perhaps notice weekly trends (e.g. weekends more or less)
      • More on how to handle comparison plots; Andy noted that PurpleAir's dashboard allows one to see multiple time series at once (the time series of each clicked point joins the others), and a mouse-over gives more info on that time series.
      • Danny plans to spend more time on the IDW heatmap.

    • Rachel:
      • Asked for screen shots to stay up to date on progress, and make comments (and know what to talk about at the next meeting, Monday).

    May 23rd, 2024

    Agenda:

    Andy had to find his lost dog. So the meeting was rather impromptu, after we finally got together. Apologies for not taking good notes about what each person was to do, etc. My mind was elsewhere, and my Zoom connection was bad. But, upon reviewing the transcript, I have a few things:

    • Andy:
      • Wondered if we could have different themes for different users (e.g. "vanilla user" versus "data nerd"; or even mobile theme....)
      • Josh is going to be easy to please; and I think DataBloom has a better idea of what's really important in terms of sharing info with ordinary users.
      • Wondered about using the wiki on GitHub, but it's disabled because we've got it set to a private organization.
      • If the PurpleAir data is really as cheap as we think (e.g. pennies a day) then we need to grab more, rather than less. Josh has offered the budget to do this.
      • Discussed isotropic models of autocorrelation (e.g. do we trust all sensors equally at the same distances in all directions?) versus anisotropic models (e.g. there's a wind from the west that tells us that we can trust the sensors along the direction of the wind more than those perpendicular to the wind direction). We need to do some Exploratory Data Analysis (EDA) to discern the effect.
    • Danny:
      • Add on a "data specialist" sidebar as possibility (what, you don't like the term "data nerd"?:)
      • First priority should be to meet the needs of ReNewport
      • Can get started on a heatmap, using Inverse Distance Weighting (IDW) as a start; Andy mentioned "nearest neighbors" as another simple option (e.g. use an average of the four nearest sensors).
      • Danny appears to want to keep an eye on River Metals Recycling, to see whether activity from that site shows up downwind.
    • Tyler:
      • Provided a link to the historical PurpleAir and EPA data (Andy has downloaded, and is at the dawn of investigation. Nelum too?)
        • PurpleAir: each monitor as a csv, with time_stamp,sensor_index,humidity,temperature,pressure,pm2.5_atm,pm10.0_atm
        • EPA: data csv files, by month, each a collection of monitors reporting (different things) by hour. NKU's produced PM2.5 and Ozone, for example; others report PM10, NO2, etc.
      • Will continue looking into the advantages of using SQLite versus going with an R data frame; DataBloom seems to be leaning away from the database (but Andy at least is leaning into the database). A database
        • in part helps avoid a proliferation of data files, and
        • helps enforce systematization of formats -- we want to force those submitting data (e.g. GroundWork) to think about putting variables on a common footing.
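The two interpolation ideas mentioned above for the heatmap (Inverse Distance Weighting, and Andy's average of the four nearest sensors) can be sketched as follows. This is Python for illustration, not the dashboard's R code. The function names, the exponent `power=2.0`, and the use of flat planar coordinates are assumptions. Real latitude/longitude pairs would need a projection or a great-circle distance.

```python
import math


def idw_estimate(sensors, x, y, power=2.0):
    """Inverse Distance Weighting estimate at (x, y).

    sensors is a list of (x, y, value) tuples in planar coordinates.
    Each sensor is weighted by 1 / distance**power.
    """
    if not sensors:
        raise ValueError("need at least one sensor")
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in sensors:
        distance = math.hypot(sx - x, sy - y)
        if distance == 0.0:
            return value  # the point sits exactly on a sensor
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def nearest_mean(sensors, x, y, k=4):
    """Plain average of the k sensors closest to (x, y)."""
    if not sensors:
        raise ValueError("need at least one sensor")
    ranked = sorted(sensors, key=lambda s: math.hypot(s[0] - x, s[1] - y))
    nearest = ranked[:k]
    return sum(value for _, _, value in nearest) / len(nearest)


if __name__ == "__main__":
    # Made-up sensors at the corners of a unit square.
    corners = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0), (0.0, 1.0, 30.0), (1.0, 1.0, 40.0)]
    print(idw_estimate(corners, 0.5, 0.5))
    print(nearest_mean(corners, 0.5, 0.5))
```

Both estimators are isotropic: they weight sensors by distance alone, ignoring direction. An anisotropic variant, as discussed above for wind, would need a direction-dependent distance.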

    May 20th, 2024

    Agenda:

    We have a group meeting at 11:00, followed by a meeting with DataBloom folks at noon. The group meeting is being run by Kristy Hopfensperger, and includes a few minutes for us to provide a demo of what has been accomplished so far, in terms of

    • dashboard construction
    • database design issues
    • server issues
    • PurpleAir interaction.

    Danny will provide an approximately 10 minute intro to the dashboard, followed by Tyler with a few minutes about database, followed by Q&A.

    Then we will have a discussion with DataBloom folks to discuss next steps, and database issues

    May 16th, 2024

    Agenda:

    Today we want to check in on

    1. database and code base progress
    2. any server progress
    3. Dashboard progress
    4. discuss a presentation on Monday for the larger group (including ReNewport and GroundWork folks, as well as Kristy).

    Last time

    Along with our partners at DataBloom we came to the conclusion that R/Shiny/Leaflet would be the best option going forward, with Grafana being an aspirational target for features.

    In terms of responsibilities and todos, we decided that

    • Tyler
      • leads Database (thinking SQLite, perhaps)
      • sets up the GitHub repository (Different repos, for code, statistical methods, etc.)
      • checks with CS on server access.
      • looks at EPA data
    • Danny
      • leads R Shiny dashboard development
      • works on getting "PurpleAir Equivalence" for dashboard
    • DataBloom will be responsible for cleaning GroundWork data. Then we'll have to figure out how to integrate this into our database (we'll focus on integrating PurpleAir and EPA data).
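To make the SQLite option above concrete, here is a rough sketch of a single table that could hold PurpleAir and EPA readings on a common footing. It uses Python's standard `sqlite3` module for illustration. The table name `readings`, the `source` column, and the column names are hypothetical. The value order in `add_reading` follows the PurpleAir csv column order noted in the May 23rd minutes.

```python
import sqlite3

# Hypothetical schema: one row per source, sensor and timestamp.
SCHEMA = """
CREATE TABLE IF NOT EXISTS readings (
    source       TEXT    NOT NULL,  -- hypothetical label: 'purpleair', 'epa', ...
    sensor_index INTEGER NOT NULL,
    time_stamp   INTEGER NOT NULL,  -- Unix epoch seconds
    humidity     REAL,
    temperature  REAL,
    pressure     REAL,
    pm25         REAL,
    pm10         REAL,
    PRIMARY KEY (source, sensor_index, time_stamp)
)
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_reading(conn, source, row):
    """Insert one reading given in PurpleAir csv column order.

    A second insert with the same (source, sensor_index, time_stamp)
    is silently ignored, so re-running an import does not duplicate rows.
    """
    time_stamp, sensor_index, humidity, temperature, pressure, pm25, pm10 = row
    conn.execute(
        "INSERT OR IGNORE INTO readings "
        "(source, sensor_index, time_stamp, humidity, temperature, pressure, pm25, pm10) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (source, sensor_index, time_stamp, humidity, temperature, pressure, pm25, pm10),
    )


if __name__ == "__main__":
    conn = open_db()
    # One row taken from the June 3rd minutes.
    row = (1701010800, 184711, 48.667, 55.633, 995.309, 25.297, 26.908)
    add_reading(conn, "purpleair", row)
    add_reading(conn, "purpleair", row)  # duplicate key, ignored
    print(conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0])
```

The primary key is one way to get the systematization the minutes ask for: every data provider has to map its variables onto the same columns before anything can be inserted.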

    Upcoming meetings

    • 1:30 pm: Josh's walking tour of Newport
    • Monday, May 20th, 11am: Big group Dashboard meeting, hosted by Kristy

      We'll need to make a 5-10 minute presentation on our progress, and gauge reactions.

      All big group meetings will be at this Zoom link: https://nku.zoom.us/j/9534110091

      (Next big meeting: Tuesday, June 18th, 10 am -- note: hour earlier than usual)

    • Monday, May 20th, noon: Following our meeting on Monday, we'll stick around and chat with DataBloom
    • Thursday, May 23rd: 1:00 pm meeting with Greg Lemmon

    Todos

    Nelum

    Andy

    Tyler

    Danny

    May 13th, 2024

    Agenda:

    The signal corps has been going gangbusters, and we've got a most important decision on our agenda: to decide on the tools with which we will create the dashboard.

    Visitors/New folks

    • Welcome Rachel and Tim: DataBloom representatives
    • Welcome Ethan? Not yet! Next week, hopefully....

    Today's business: choosing the Dashboard environment

    As of our last meeting, we had two primary competitors for the tool creation (although Tyler was interested in looking into Tableau).

    • R Shiny/Leaflet Dashboard (Danny):

      • What are its strengths?
      • What are its drawbacks?
      • Is it flexible? Easy to add new features?

      Danny's thoughts:

      • It is really flexible;
      • Pretty good fake of PurpleAir's dashboard so far:
      • Ethan can help on the help and home side;
      • Nothing it can't really do;
      • Browser dependence
      • Leaflet, javascript, R, css; bit of a Frankenstein? But 95% R, per Tim's comment

    • Grafana dashboard (Tyler):

      • What are its strengths?
      • What are its drawbacks?
      • Is it flexible? Easy to add new features?
      • How well does it play with R?

      Tyler's thoughts:

      • Can't isolate portions of the display; just collapse menus
      • Already has the eye candy that Danny worked so hard to fake
      • Easily mobilized
      • When it comes to making dashboard public... must reexecute queries, and perhaps other issues
      • Can pull data from all over different databases.

    • Tableau thoughts?

    • Beyond the dashboard software itself,

      • Scraping the PurpleAir data
        • API
        • Costs
        • Storing the data to the/a database

      • Database
      • AQI: Danny's conversion issue

    • This is the fork in the road:

      • if we're going away from R Shiny, and into "unknown territory", then we need to make the case (and consider more time deciding on tools);
      • if we're going with R Shiny, then we're ready to go.

    • If ready to go, then we need to divvy up new tasks for the next week:

      • Database: Tyler lead?
      • R Shiny: Danny lead?
      • Mobile: Ethan lead?
      • GitHub repository?
        • Tyler will set up
        • Different repos, for code, statistical methods, etc.

    • Everyone: documentation! :) Let's document within code, and then create a Markdown doc on the GitHub site.

      • Mine the Signal stream so far for links and references.
      • Mine the Signal stream for ideas and observations.

New Business

Upcoming meetings:

  • This Thursday:
    • 11am: ReNewport group meeting

    • 1pm: Josh's walking tour? (Do I have that right?)

  • Heads up: Monday, May 20th, 11am: Big group Dashboard meeting, hosted by Kristy

    We'll need to make a 5-10 minute presentation on our progress, and gauge reactions.

    All big group meetings will be at this Zoom link: https://nku.zoom.us/j/9534110091

    (Next big meeting: Tuesday, June 18th, 10 am -- note: hour earlier than usual)

  • May 23rd: 1:00 pm meeting with Greg Lemmon

Todos

Nelum Andy

  • Contact Josh about 1:30 tour
  • Contact Kristy, check on server and get on agenda with demo

Tyler

  • GitHub organization
  • Collect different data sources;
  • CS about server
  • Type of database; SQLite

Danny

  • Finalize to confident PurpleAir equivalent
  • Plot options
  • Better plot look
  • Wait on plot options; shell of options
  • Add legend, calendar
  • Compare locations
  • Get Andy contact info for Ethan

May 9th, 2024

Agenda:

What did we learn last time?

  • Data sources information
    • AQ Mesh, EPA, Purple Air
  • Rachel's document was really useful
  • Using her dashboard documents for model
  • DataBloom info was really useful
  • Learned why they're doing what they're doing
  • Lots of resources they provided that can be leveraged
  • Thoughts of creating a library with a better index for R resources.
  • Server location and issues

Communication

  • Wiki accounts yet?
  • Signal messaging
  • Running Zoom meetings (Nelum take over? It's dicey to run things from Canada!)
  • Next meeting, Monday, 11am, and we need to invite Tim and Rachel

AI and Deep Learning Models

  • Discussion with Greg Lemmon (of Eureka Ranch) re AI
  • Have we got a date and time?
    • May 23rd, 1:00-1:30

Tour with Josh Tunning, Director of ReNewport

  • Learn more about the monitors, where they're placed, how they function (or don't), discuss possible other placements, etc.
  • "I would love to meet the students and showcase the monitors! Let me know a time and date that works best and we can make it happen, maybe even turn it into a short walking tour showcasing some of the environmental concerns the neighborhood has!"

Progress in the competition so far:

  • Document progress. Let's start a manuscript....
  • Tyler and Grafana
    • Free, open source
    • Easy to use
    • Blocks, allowing us to add time series
    • Better for quick and dirty?
    • Will look into other essential features, pop-up info, legends, how the database is accessed and when
  • Danny and Leaflet
    • Shiny Dashboards:
      • Very HTMLish....
      • It's important to look good, and work fast.
      • 3000 point issue; maps flex well, but not points.
      • wikidata.org database?
      • Mobile issues....
  • A collection of PurpleAir data for our locations and for March 16th, 2023
  • Nelum's R visualization

May 7th, 2024

Meeting with DataBloom folks (Tim NeCamp and Rachel Greenlee), with Nelum, Tyler and Danny.

The original request:

My email (annotated):

     As I mentioned on the call yesterday, we are going to be getting our feet solidly on the ground in that first week, after our semester ends, and will then set off in the most promising direction(s) we identify. However you all can help us in that process, the better!

      I see us sharing resources actively, i.e

  • moving between websites,
  • providing or receiving access to data,
  • discussing strategy, etc.

We'd especially like to learn how we can support your efforts – as Kristy said, we don't want to duplicate effort, but want to provide valuable assistance (perhaps especially in the modeling vein, but also in the dashboard construction itself). I am imagining that the four of us will be in a single room on campus, with a couple of us actively on-line, downloading data, etc.

Tim's response:

We can be available that whole time period and am happy to make it into a working session as you have described. We can start off the meeting talking through where we are with the project, and the various data sources we have access/will access into the future, and then go from there.

Dashboard work


Agenda

Andy:

  • Thanks for making time
  • Josh Tunning of ReNewport may be joining us
  • Intros
  • Our team's project: build a dashboard, and fill it with cool stuff!:)
    • But we don't want to duplicate effort,
    • step on toes,
    • etc.
    • We want to be a force for good.
  • Today we'd like to hear from you:
    • The teams working together -- how do you see the field of agencies?
    • What's been done so far.
    • How you see us fitting in.
    • How might you be able to help us do a better job?

Tim:

Some Takeaways

  • Rachel will join us Monday at 11, to see how we're doing. Hopefully we'll have made a choice by then as per the software tools we're going to use.
  • The DataBloom folk lean R and R Shiny; from the standpoint of continuing to support them (and make them comfortable with maintenance), I lean in that direction. If we decide to venture away from that, we should take a little more time, and make the case to them.
  • Tyler brought a focus on databases into the discussion. If we go with Grafana, we may need to figure out a way to interface SQL with R, and then allow Grafana to direct the operation and display the data without doing the analysis per se.
  • Danny suggested a bit of a dashboard competition:
    • Grafana versus Leaflet (Tyler the former, Danny the latter)


Working Group

Kristy Hopfensperger is leading this effort, between GroundWork, DataBloom, NKU, ReNewport, etc. etc.?

May 6th, 2024

End of project Priorities

  • working dashboard
  • Tyler -- diary of reflections on educational impact
  • Heather Bullen - August 27th
  • A talk at MathFest, August 7-10
  • Have fun!

Dashboard choice

  • Grafana -- recommended by Friend of Tyler
  • Shiny Dashboard

Dashboard objectives:

  • expandability
  • good foundation
  • maintainability
  • interface with R easily
  • looks good, maps, etc.
  • easy to use; intuitive
  • customizable
  • Minimum: reproduce Purple Air's dashboard

Dashboard features

  • Calendar plot
  • Legends and scales
  • choropleth maps
  • dials, buttons, etc. -- how should it look? Theme-able
  • time series
  • comparison plots
  • historical
  • provide data?
  • colors versus numbers; is there a standard we should follow?
  • accessibility; plays nicely with screen readers
  • educational features
    • glossary
    • help
    • background
  • Alerts
    • email, text, device notifications
  • documentation of methods (e.g. extrapolations, interpolation)


Modeling

  • Do it in R
  • shifting monitors
    • cross-validation,
    • deep learning?
    • model updating
  • different types of monitors
  • identifying spoiled meters
  • How do we check monitors gone bad?

Todo:

  • Tyler:
    • Explore R; for the dashboard, explore Shiny and Grafana
    • Explore data (prepare some questions to be answered, and then explore after tomorrow's meeting)
  • Visit the monitor locations (EPA on campus, Purple Airs; notes, photos)
  • Nelum: R code for basic functionality
  • Andy:
    • reach out to Josh for visit
    • demo API for Purple Air and R code
  • Danny: will have fully functional dashboard by Thursday

Schedule:

  • Tyler: 9-5, Monday and Thursday; four hours on Tuesday
  • Nelum: 11-5, MR
  • Meetings at 11 on MR

Next meeting is tomorrow at 11, with DataBloom, regarding data collection.