Aidan Neeson's Research Notebook #15
Replies: 30 comments
-
Summary of September 1st - 8th
Week of undocumented research notes to get caught up in my notebook.

Sept 1st
At the first group meeting we discussed initial project ideas. Mine dealt with creating a tool to perform cost-benefit analysis and trade-off assessment of converting fossil fuel power plants into renewable energy power plants. This involved collecting data about fossil fuel power plants, "creating" a renewable energy array to match each power plant, and then performing the assessment. Feedback indicated I was heading in the right direction, so I continued my research in this area, collecting articles and potentially useful data sets.

Sept 6th and 8th
First one-on-one meeting with Professor Kapfhammer (GK), in which we discussed my initial project idea. Important topics included the experimental component of my project, as well as the data needed to complete it. By the end of this meeting, my goals moving forward were clear. I needed to find the right data, which ended up being harder than expected. On the 8th, I had some reliable data, but I spent the group session without GK brainstorming and finding more data.
-
Week 3
New idea development, and a new path forward.

Sept 13th
One-on-one meeting with GK. The topic of discussion was again data, but we went more in depth on the experiment. My initial project idea's experiment was not very clear. It wanted to be a data analysis, but also a user study, and neither of those meshed well with what I had in mind. We managed to brainstorm a more solid idea that set up a clear path forward, including what data I should be looking for and the experiment that will be run. My project idea moving forward is now: using labeled data on locational and environmental factors, create, train, test, and evaluate the effectiveness of different machine learning models and algorithms to see which is better at predicting the viability of replacing fossil fuel power plants with renewables.

Sept 15th
Group meeting: discussions about each other's projects, development of pitches, understanding feasibility, and identifying similarities between group members' projects. Big takeaways included finding the right data, and building the correct artifact that demonstrates feasibility effectively and clearly.

TODOs 💹
-
Week 4
Compiling data; investigating MLMs; start creating artifact

Sept 18th
One-on-one with GK gave me more direction and confidence in my progress. Moving forward, the focus is to compile the data together correctly and to get started using MLMs, but it is important to start small. Simpler processes, like using Python with scikit-learn, will be the start. Another important topic of discussion was distinguishing my work from NREL's PVWatts API. One difference is that my work also includes wind data, which PVWatts does not cover; another is my use of different MLMs, or machine learning approaches in general.

Sept 20th
One-on-one with GK supplied me with a more defined project roadmap. With all of my data secured, and a pathway in place for it to be "cleaned up" and ready for passing into models, I need to start looking into which models will yield the best outcomes. Some topics of importance were: linear regression and other simple models being a good place to start; optimizing the predictive model by avoiding overfitting and underfitting, and splitting up the data accordingly; and eventually using k-fold cross-validation to confirm that the models predict well.

Sept 22nd
Group meeting had us discuss what we have built so far. In the discussion, some questions came up regarding how I would combat the issues I was running into with fitting my data into a linear regression. Clustering was pointed to as the easiest short-term, and potentially long-term, way to ease my struggles. Looking into that is a good next move.

TODOs 💹
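As a rough illustration of the k-fold cross-validation idea from the Sept 20th entry, here is a minimal sketch using only the Python standard library. The data is made up (a noise-free line), and the helper names `fit_line`, `k_fold_indices`, and `cross_validate` are hypothetical, not part of my artifact. In practice, scikit-learn's `KFold` and `cross_val_score` cover the same ground.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(xs, ys, k=5):
    """Return the held-out mean squared error for each of the k folds."""
    scores = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        # Fit only on the training indices, then score on the held-out fold.
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in fold]
        scores.append(mean(errors))
    return scores


# Made-up toy data (y = 2x + 1, no noise), purely for illustration.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
fold_mse = cross_validate(xs, ys, k=5)
```

Each observation is held out exactly once, so the spread of the per-fold scores gives a sense of whether a model is overfitting or underfitting, rather than relying on a single train/test split.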
-
Week 5
Exploring data and performing cluster analysis

Sept 27th
One-on-one with GK gave me more resources to look into regarding cluster analysis. Some of these were new algorithms: hierarchical, agglomerative, and random-based. Others were evaluation metrics, like the cophenetic correlation coefficient. We also discussed exploring dimensionality further, and how to get more data into the clusters. My demo was given more direction: instead of doing a small-scale replica of my project, we discussed exploratory data analysis (EDA). Doing this would prove feasibility by showing that I fully understand my data and by demonstrating that it is fit to use for my project.

Sept 29th
Blue and Gold Weekend Alumni Panel
During the event I touched base with many of my peers and asked them about their research projects. We discussed project ideas, current progress, and future plans. We gave each other feedback and insights about various things related to our work. I met many alumni, as well as event coordinators. A big takeaway from these conversations was the importance of challenging yourself while also being kind to yourself. Having a strong support group of close friends and peers also seems to help a lot, not only with personal issues, but also with forming professional connections.

TODOs 💹
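For reference, the cophenetic correlation coefficient mentioned in the Sept 27th entry is commonly defined as the Pearson correlation between the original pairwise distances and the distances implied by the dendrogram. The symbols here are my own notation: d_ij is the original distance between observations i and j, t_ij is the height at which i and j are first merged in the dendrogram, and the bars denote averages over all pairs.

```latex
c = \frac{\sum_{i<j} \left(d_{ij} - \bar{d}\right)\left(t_{ij} - \bar{t}\right)}
         {\sqrt{\sum_{i<j} \left(d_{ij} - \bar{d}\right)^{2} \; \sum_{i<j} \left(t_{ij} - \bar{t}\right)^{2}}}
```

A value of c close to 1 suggests the hierarchical clustering preserves the pairwise distances in the data well, which makes it a handy way to compare linkage methods.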
-
Week 6
Finalizing EDA and organizing a pitch; clean up repository; explore clustering in more depth

Oct 2nd
One-on-one with GK finalized the structure of my artifact, its repo, and the direction for my pitch. I also learned more about how to move forward with my project after my pitch. Priorities for the next two weeks include:
When it comes to future visions for my project, a step in the right direction seems to be using a tool like Quarto to host my code. This is because my tool/code needs to be interacted with, and this approach allows for more interesting interactions as well as other features like testing of this code.

Oct 4th
One-on-one with GK gave my documentation more direction. I was struggling to find the right voice to use in my documentation, but we brainstormed all of the options together and what each of them means. I am choosing a voice that assumes the users of my tool are akin to those involved in the CMPSC 600 course. This makes the most sense to me as this is just a prototype, so catering to the masses is not an important factor right now. Also, a dependency management question was answered, and it seems that in the future I will need to look into tooling.

Oct 7th
Group meeting was focused on feasibility. For my project, the "claim to fame" in terms of demonstrating feasibility was the EDA that I have done. Being able to validate my data, outline all of the metrics, and showcase the relationships within it serves to instill confidence that my data is suitable for my project. On top of this, using that data to experiment with some clustering models shows that going further with more in-depth models, and evaluating their performance and accuracy, is not out of reach.

TODO 💹
-
Week 7
Wrapping up the artifact

Oct 11th
I was sick today, so I was not able to meet with GK, but I set goals for the next three days that essentially encompass finishing up work on the artifact and tying up any loose ends. These tasks include polishing and bolstering the GitHub Actions workflow, finishing the documentation of my Jupyter notebooks, writing a full and comprehensive README that includes details about my project as well as instructions to run my artifact, and, if any extra time is available, performing a linear regression on some of my newly generated data in a notebook.

Oct 12th
The GitHub Actions workflow is fairly polished. I am able to use Ruff in notebooks, although this is an experimental feature, so some tweaks had to be made to the tool for it to lint in the desired way. I have looked into running the notebooks in GitHub Actions, but the CLI commands used to do this seem to be a bit finicky, so a little more work is needed here. The README is nearly finished; I just need to include the steps to run the artifact, which will be a quick task. The notebooks are also nearly fully documented. I have been doing this alongside mapping out the steps for the README, as I am treating the notebooks almost like a storybook in the way they flow.

TODO 💹
-
Week 8
Creating a presentation

Oct 16th
Met with GK and discussed an outline that I created for my presentation. All of the information I had in it was good, but some important things were missing. One of those was an explanation of my evaluation methods, or future plans for evaluating my work. We discussed k-fold cross-validation as a good way to verify the results of my cluster analysis, as well as of the other ML models that will be used.

Oct 18th
Met with GK again to discuss my demo. We decided that it would be best to record a video, as I will only need to showcase specific notebooks but may still have to run a lot of them. Cutting this process out with editing will shorten my demo and ensure I can present everything in the given time. We also talked about what to include in the demo and we landed on specifically highlighting a few things:

Oct 20th
Group meeting today had us showcase what we had ready to be presented. I restarted so I did not have much, but I got helpful feedback from my peers to ensure my presentation is top-notch. I will work on finishing up the slides and incorporating this feedback as I go.

TODO 💹
-
Week 9
Presenting; gathering literature

Oct 23rd
Had a brief discussion with my second reader (Prof. Luman) about what is expected from us in the weeks of presentations. We talked about working on the part of my research that is lacking the most, whatever that may be. As they did not know much about my project, we discussed it and they gave valuable feedback in almost all areas of my research. Moving forward I will take all of this feedback into consideration.

Oct 25th
Met with GK to discuss what I will be working on moving forward. I decided the area of the research that is most lacking is the literature, so this will be my focus for the next few weeks. Some key areas to look into were identified:

Oct 27th
I presented today, and watched a few of my group's presentations as well. I think it went well; my group members' presentations were very good.

TODO 💹
-
Week 10
Gathering literature

Nov 1st
Had a meeting with GK about some of the articles that I gathered as well as the struggles I was facing while looking into articles. We talked about what I should be looking for in a paper, and that not every source has to be a "silver bullet" paper. It was made clear that it is important to get a variety of reliable sources that help to support specific points.

TODO 💹
-
Week 11
More resource gathering

Nov 6th
Another meeting with GK to discuss some of the sources that I found. I noted that I had found a good source that points to a lot of other valuable research that I can use for motivation purposes, specifically when writing my introduction. We agreed that it was a useful source, but it became clear that more work needs to be done to validate the sources that I use. GK brought to my attention that the review period for the journal was very short, which is a red flag. This is something I will need to watch for moving forward.

TODO 💹
-
Week 12
Starting to write

Nov 15th
Quick meeting with GK to discuss the process of starting to write. I feel confident that I can at the very least start to outline my writing for my first two chapters. We discussed what these would look like, as well as how I should go about beginning my writing. He said it would be very valuable to do all of the work in the thesis repository, but we do not have access to it yet. Until we get it, the advice was to stick to outlining the chapters and laying out ideas, as it would be more work to transfer writing from another platform over to the repository.

TODO 💹
-
Week 14
Continue polishing outline and start writing

No Meeting This Week
Coming out of break, I felt a bit unmotivated and did not get enough work done to justify a meeting for this week. I outlined my introduction chapter and began writing some bits that I felt confident I could include. I noticed that there was a gap in my resources relating to ethical issues, so I will start looking there.

TODO 💹
-
Week 15
Finish first chapter and begin second chapter

Dec 8th
In a group meeting we discussed the work that we have completed on our first chapters, as well as the state of our artifacts. I have a completed first chapter that needs minor updates, but I have not touched my artifact, as I have not seen it as being as important as writing. We also talked a lot about ethics: each member talked about the implications of their work, and helped to brainstorm implications for others if they were struggling. The meeting helped me identify more areas for improvement in my introduction, and got me thinking about beginning my second chapter.

TODO 💹
-
Week 16
Write second chapter

Dec 11th
Meeting with GK helped me to understand what goes into a related work chapter. In my case, it seems best to treat the chapter as both a background and a literature review, as there is not enough directly related work to constitute a chapter dedicated to it. There are enough other relative unknowns to support introducing those concepts in this chapter, however. Many of the concepts that need to be introduced take the form of my research, which can be used to advocate for my research as filling in the gaps left by the other related work, so it all seems to work together. In the end, I feel a lot more confident about writing this chapter, and with some minor tweaks to my outline, I should be good to go.

TODO 💹
-
Hello @AidanNeeson, you have submitted an exceptional research notebook that clearly meets and exceeds the baseline requirements. Overall, very good! Please carry this quality of your work into the upcoming semester of your senior thesis research.
-
610 Research Notes ✏️
-
Week 1
Lay out roadmap for second semester; discuss and implement feedback on chapters 1 & 2

Jan 19th
First group meeting of the semester, now with 580 students as well. It is interesting to see them in class with us and I hope they can get a lot of value out of sitting in on our discussions. We talked about our ideas, reiterating a pitch to see where they stand. My project has developed a little, now including more social factors, but I am afraid that it is starting to sway more towards that.

TODO 💹
-
Week 2
Discuss and implement feedback on chapters 1 & 2; reintroduce artifact and enhance it

Jan 24th
One-on-one with GK. We met to discuss my first and second chapters. We discussed the idea that references may need to be more narrowly scoped in some places. Being less general would help them support my main points even further. The big takeaway from the meeting was the direction of the project. My research started to shift towards implying my work would lead to a user study, but this was never the intention. We discussed methods and steps to take to ensure my research stays on track with my original idea.

Jan 26th
Group meeting. Today we reintroduced our artifacts and demonstrated them to our group. We talked about things that may have changed or been updated over break. The main focus was to roadmap what remains to be implemented. This discussion laid the groundwork for what I need to do with my artifact moving forward. I was to move away from a collection of Jupyter notebooks, and hopefully be able to host it all in a self-contained Quarto website.

TODO 💹
-
Week 3
Work on artifact further; start on chapter 3

Jan 31st
Meeting with GK consisted of discussing how to talk about the ML algorithms being used, and where to get the information from. The Journal of Machine Learning Research was discussed as a good source for direct information about models; however, it would be better to find more directly linked sources that are within the project's field. We also discussed how to start the transition process for my artifact, as well as what other features could be added to it if time allows. We talked about how Quarto makes the transition really easy, since notebooks are the backbone of a Quarto project. We also discussed using Shinylive to add some interaction with the models, but this is extra content and not a priority.

Feb 2nd
Group meeting had us discuss our methods. We outlined how our methods sections would be structured, but more importantly, broke them down into their most important pieces and pitched the idea to the rest of the group to gauge whether others could grasp the entirety of the project based on the explanation. The point, I believe, is that if someone cannot understand your project well, then your methods, or your description of them, is flawed. Mine seemed to be well understood by the group, which is a good sign.

TODO 💹
-
Week 4
Further develop artifact; continue/finish chapter 3

Feb 7th
Met with GK today and we discussed progress on the artifact. I gave a short demonstration of how the website is coming along, making note of how I was using Quarto and choosing to organize the information. I also brought up an issue with a particular feature not working, which involves having the linked notebooks be represented in Google Colab. This idea may have to be scrapped, as doing this manually may be infeasible given the time constraints.

Feb 9th
In-class meeting today had us shadowing the 580 students' projects and giving feedback. More specifically, we listened in on their methodologies and gave verbal feedback about their descriptions. All of them were okay, but they were all missing crucial pieces that define their project and how it was constructed. The overall consensus is to exhaustively document what went into making the project.

TODO 💹
-
Week 5
Continue artifact development; finish chapter 3; begin experiments

Feb 14th
Met with GK today and we discussed the experiment I will be conducting. We had previously decided on k-fold cross-validation, and the consensus was that this should still be used. We discussed the metrics that will be used for scoring. I outlined R-squared, RMSE, MSE, and MAE as the current candidates. It seems that not all of them may be necessary, and it is important to make note of how they are distinct and what each score measures. Other avenues covered were statistical and practical significance, which can be measured through p-values and effect size. Some discussion also went into AutoML methods, which can streamline the process significantly and make the methods simpler as well. The main tool talked about was SapientML. I will look into this, but I am not sure I will take it on, as I do not want to make drastic changes to my methods if it is not necessary.

Feb 16th
Group meeting today had us showcasing our "final" computational artifacts. My website was not complete, as the content formatting takes a lot of time, but all of my models and experimental tools were working, so I focused my showcase on these, while noting that the goal is to put all of the information onto the website soon. It seemed to be received well, and the people I showed it to did not have any negative feedback.

TODO 💹
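To make the distinction between the candidate metrics from the Feb 14th entry concrete, here is a minimal standard-library sketch. The function name `regression_scores` and the numbers are made up for illustration and are not results from my experiments. scikit-learn's `sklearn.metrics` module provides equivalents such as `r2_score`, `mean_squared_error`, and `mean_absolute_error`.

```python
from math import sqrt
from statistics import mean


def regression_scores(y_true, y_pred):
    """Compute MSE, RMSE, MAE, and R-squared for paired true/predicted values."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = mean(r ** 2 for r in residuals)    # penalizes large errors quadratically
    mae = mean(abs(r) for r in residuals)    # average error size, in the target's units
    y_bar = mean(y_true)
    ss_res = sum(r ** 2 for r in residuals)  # variation the model fails to explain
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)  # total variation around the mean
    return {
        "MSE": mse,
        "RMSE": sqrt(mse),                   # MSE brought back to the target's units
        "MAE": mae,
        "R2": 1 - ss_res / ss_tot,           # share of variance explained
    }


# Made-up values, purely for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]
scores = regression_scores(y_true, y_pred)
```

Because RMSE is just the square root of MSE, the two always rank models identically, which is one reason reporting both can be redundant.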
-
Week 6
Examine experimental results; refactor datasets; plan chapter 4

Feb 19th
Talked with Professor Luman about my results, as I was not able to meet with GK one-on-one this week. In short, the results were very concerning because they were reporting perfect models, which is highly suspicious. Prof. Luman and I managed to determine that the dataset was extremely biased, as We also discussed what this could turn into when it comes to the final chapter. We talked about noting how the publicly available data is insufficient or infeasible to work with, and that more work than is reasonable must be done to use it correctly. The idea is to note how this project could signify that changes need to be made in the field of data to allow research like this to be conducted.

Feb 23rd
In-class meeting had us discussing our experiments and experimental design. I talked about what made up my experiments, as well as the results. I noted how the random forest is working very well, while the other models are not performing up to standards. I discussed k-fold cross-validation as the main strategy, why it is important, and what it seeks to accomplish.

TODO 💹
-
Week 7
Continue refactoring datasets; re-run experiments; write chapter 4

Feb 27th
I met with JJ to gain more insight into my results and how I should be interpreting them. We also discussed the experiment, in this case k-fold cross-validation, as well as metrics. I was having trouble talking about my metrics, and I asked for guidance on how to interpret them. The advice was that the metrics I was using may not give the best insights, and that other or fewer metrics could better define my results and avoid repeating information. I also met with GK today and discussed progress on the refactoring. I showcased my new wind data and talked about how this approach should in theory remove the bias, so long as everything tracks and I can articulate this in my new results. I also discussed my newly chosen metrics, those being R-squared, RMSE, and MAPE, as they concisely cover many of the bases when it comes to evaluating results.

March 1st
Group meeting. A short discussion about future work took place. I brought up the two points that are the most obvious future work for my project. The first is to use a better dataset, which is 70 TB. The second is to conduct a user study based on my completed work to see if it is easier to use lower-complexity machine learning models compared to the meticulously crafted ones that other organizations use. We also demonstrated artifacts again. I looked at the SEERS project that dealt with anti-patterns and mutation scoring. I saw the analysis/pre-processing side of the project, what goes into it, and some command-line output. I left an issue here

TODO 💹
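MAPE, the newly chosen metric in the Feb 27th entry, expresses error as a percentage of the true value, which makes it easier to compare across targets with different scales. Below is a minimal standard-library sketch. The function name `mape` and the numbers are made up for illustration. The zero check is my own assumption about a sensible guard, since the ratio is undefined when a true value is 0.

```python
from statistics import mean


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    # The ratio below divides by each true value, so a zero target has no defined MAPE.
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    return 100.0 * mean(abs((t - p) / t) for t, p in zip(y_true, y_pred))


# Made-up values: errors of 10%, 10%, and 0% of the true value.
example = mape([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
```

If a target can legitimately be zero, MAPE needs extra care, and RMSE or R-squared remain usable in that situation.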
-
Week 8
Spring break; catching up

The Break
Over spring break I spent most of my time catching up in the places where I was behind. Specifically, I was behind in two areas: the website and chapter 4.

For the website, some of the notebooks were still unformatted, and the pages for the models were not finalized in the way that I wanted them to be. I spent a lot of break planning out the content, as well as the organization of these pages, to ensure they flowed well and contained the right information. I also spent some time filling in these pages with actual content, leaving only a little work for the week or two after the break.

For chapter 4, I had not finished it by the start of break, so I took this time to relax and finish writing out the rest of it. I had just finished getting my new results from my refactored datasets, so everything was in order for me to write. The only thing missing from chapter 4 now is some kind of graphical representation of the model, but I am not sure how to do this just yet.

TODO 💹
-
Week 9
Finalize most of the website; start writing chapter 5

Mar 13th
Meeting with GK consisted of showcasing the mostly finished website and its layout and content. We also discussed the results more, talking about what they mean and their purpose. I felt a little lost this week, so we had to rehash some of these concepts. Another one that was revisited was k-fold cross-validation. Its purpose, and that of the metrics, had become obscured over time, and I needed a refresher to ensure my chapter 4 content was solid and that, going into chapter 5, I could explain the implications correctly. We also discussed ways to potentially visualize the models, which left me with a few ideas.

Mar 15th
In-class meeting had us discussing our final experimental results, their impact, and how they connect to project goals. I noted that out of the three algorithms, the random forest is performing the best, and not only that, it is performing very well. I indicated that the other models were not performing up to standards; however, some insights can still be gained from their implementation. The main point I wanted to highlight is that machine learning shows immense promise in the field of predicting renewable energy parameters, so much so that it has the potential to be used in applications. We also gave a brief rundown of some future work. For me, this consisted of better data gathering, using more models, using the models better, and performing a user study on a potential app built from the models.

TODO 💹
-
Week 10
Finish website completely; finalize chapter 5; finalize thesis

Mar 20th
Meeting with GK. I showcased the "final" website and all of its features. The only piece left was the home page. We also went over the results and implications one last time to ensure everything was in order before turning it in. We talked about what the results mean and rehashed the idea that they are promising, and that RF is clearly the winner and the most prominent insight gained from the research project. I also showed my visualization method, and it was confirmed to be a good approach given the scale of the data being graphed. It is hard to visualize the data in two dimensions because of how many data points there are and because of the number of input and output features. Overall, everything seems in order, and some final touches need to be made before turning things in.

Mar 22nd
Today we went over project ideas for each of the 580 students. We went around the table, listened to each of them pitch their idea, and gave feedback about it. In short, all of the ideas were unfinished or insufficient in some capacity. Whether it was the core of the idea or the evaluation strategy, something was missing to make it fully complete. The takeaway from the meeting is that creating an idea is HARD. It takes time and careful thought to ensure that all pieces of it come together to create something unique, interesting, impactful, and evaluable.

TODO 💹
-
Week 11
Creating defense presentations

Mar 26th
Met with GK today to discuss an initial outline for my defense presentation. We talked at a high level about each of the sections to be included. For motivation and goals, the idea and strategy I had in mind was more than sufficient. For the most part, the content I wanted to include was fine, but I was missing some crucial pieces. The results section was too broad, in that it did not home in on the fact that RF is the claim to fame. Also, the implications section said nothing other than that ML is promising. More needs to be conveyed here, and we came up with the idea that because the RF model is so good, it can be used for web applications, like PVWatts, to make a difference. It is also important to highlight how we know this is the case. The future work section has good content, but to really drive home its points, it needs to be set up in a way that makes the following point: because of these expressed implications, the foundation is laid for the following future work avenues. This way the future work does not feel like it is being listed in a vacuum, and it instead gives purpose to the future work and the project as a whole.

Mar 27th
Met with GK again today to discuss some slides I threw together the night before, but was unable to finish due to some questions I had. When it came to the methods section, I ran into some trouble with how I should display and talk about the information relating to the algorithms. The crucial information was not enough to fill one slide, so we decided to include a note about ML in general, outlining why it was chosen and the kind of ML being used; the rest of the slide can then be dedicated to each algorithm. We also talked about ways to make the figures bigger to ensure they can be seen. The way I was using them made them too small in some cases, so more careful steps needed to be taken to ensure the figures were big enough. In a similar vein, the results section needed more to indicate that RF is the real focus, and some extra text to help explain the results. The flow of the presentation was also covered, as I did not feel that putting the artifact methodology in the middle fit well, so we decided to move it towards the end so the entire presentation flows better. For the demo video, I pitched my idea and it was confirmed that it would be a good way to go, so long as it can be done in 2 minutes.

Mar 29th
First day of defenses!

TODO 💹
-
Week 12
Prepare for defense; give defense

April 5th
I gave my defense today. I feel like I did well, even being pressed for time. Many people had follow-up questions and told me how much they enjoyed it! Overall, it was exciting and relieving to finally be done (kind of).

TODO 💹
-
Week 13
Watching defenses; waiting for feedback on thesis

April 12th

TODO 💹
-
Hello @AidanNeeson, you have submitted a very thorough research notebook. Thanks for taking a proactive stance on this work. I really appreciated your focus and I enjoyed our research conversations immensely.
-
Aidan's Research ✏️
Documentation of my research idea: A green energy predictor!