FRESH_17_Topic_Modeling

This document describes the code used for topic modeling of the FRESH_17 audio diary transcripts.

Set up

Run the shell scripts on the cluster to preprocess the transcripts into files that can be converted to a corpus for topic modeling (TM) analysis. Run the R code on your local computer for the TM analysis (how to run R on the cluster has not been resolved yet).

Flow of execution

$ bash topic_modeling_transcript_level.sh STUDYNAME SUBJECT

Calls phone_transcript_allFeatures.py if allFeatures does not exist.
- Outputs allFeatures, which contains audio QC, transcript QC, and an NLP summary.
Calls transcript_level_text.py.
- Outputs processed daily transcript texts for the subject (could be modified to output weekly-level texts).
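
The two steps above can be sketched roughly as follows. This is only an illustration of the "build allFeatures if missing, then process transcripts" pattern, not the actual contents of topic_modeling_transcript_level.sh. The function name, the DATA_ROOT and RUNNER variables, the allFeatures.csv path, the scripts/ location, and the arguments passed to the Python scripts are all assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the transcript-level flow.
# Paths, file names, and script arguments are assumptions for illustration.

run_transcript_level() {
    local study="$1" subject="$2"
    if [ -z "$study" ] || [ -z "$subject" ]; then
        echo "usage: run_transcript_level STUDYNAME SUBJECT" >&2
        return 1
    fi

    # Assumed location of the subject's allFeatures file.
    local features="${DATA_ROOT:-.}/${study}/${subject}/allFeatures.csv"

    # Step 1: build allFeatures only when it is missing.
    if [ ! -f "$features" ]; then
        ${RUNNER:-python} scripts/phone_transcript_allFeatures.py "$study" "$subject" || return 1
    fi

    # Step 2: always regenerate the processed daily transcript texts.
    ${RUNNER:-python} scripts/transcript_level_text.py "$study" "$subject"
}
```

Setting RUNNER=echo prints the commands instead of running them, which makes it easy to check which steps would be executed for a given subject.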

$ bash topic_modeling_subject_level.sh STUDYNAME

Removes the previously generated output file if it exists.
Calls subject_level_text.py.
- Loops through and summarizes all available daily transcript texts across subjects.
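
A minimal sketch of the subject-level flow is shown here, under the same caveats: the function name, the OUT_DIR and RUNNER variables, the output file name, and the arguments passed to subject_level_text.py are assumptions, not taken from the real script. The point it illustrates is that deleting the stale output first keeps reruns from mixing old and new results.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the subject-level flow.
# The output path and script arguments are assumptions for illustration.

run_subject_level() {
    local study="$1"
    if [ -z "$study" ]; then
        echo "usage: run_subject_level STUDYNAME" >&2
        return 1
    fi

    # Assumed name of the combined output file.
    local out="${OUT_DIR:-outputs}/${study}_subject_level.csv"

    # Remove any previously generated output so each run starts clean.
    # rm -f does not fail when the file is absent.
    rm -f "$out"

    # The Python script is assumed to loop over subjects and write $out.
    ${RUNNER:-python} scripts/subject_level_text.py "$study" "$out"
}
```

As with the transcript-level sketch, RUNNER=echo gives a dry run that still removes the stale output file but only prints the Python command.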

Folders and files

- FRESH_17_text
- outputs
- scripts
- .DS_Store
- .gitignore
- README.md
- TopicModeling.Rproj
- topic_modeling_subject_level.sh
- topic_modeling_transcript_level.sh

jennieli421/FRESH_17_Topic_Modeling
