length-constrained maximum-sum subtree algorithms
- "= \ " to ""
- "\ " to " "
- "=\r\n" to ""
- "\r\n=" to ""
- "[IMAGE]" to ""
- "\r\n" to " "
- "\n" to " " # collapse to one line
Def of word:
- only alphabetic letters: 13041 unique tokens
- otherwise: 21862 unique tokens
- davis utilities san plant plants times million utility blackouts generators commission customers trading companies percent electric officials federal wed edison California eletricity crisis
- ect iso enronxgate amto confidential report draft enroncc susan joe communications ken comments order david june transmission markets language chairman Ken's email
- bush jones president dow stock bank companies trading dynegy confidential news service natural oil credit services copyright deal percent policies Bush and Ken Lay: Slip Slidin' Away
- davis utilities edison billion federal generators utility commission governor plan million crisis san plants electric pay companies thursday iso southern. Davis buy transmission lines
should contain fields:
- message_id
- subject
- body
- timestamp
- datetime(optional)
- sender_id
- recipient_ids
For the last two fields, it can be replaced by participant_ids
gen_cand_trees.sh
and check_events.sh
B_on_algos.sh
: increaing budget vs treesize on various algorithms