-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathtodo.txt
61 lines (55 loc) · 2.94 KB
/
todo.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
Current work:
The JJLattice.
- Sampling method
- detect experiment convergence
- evaluate accuracy at various numbers of experiments as a start
- evaluate accuracy of full distribution
- make it easier to specify that JJLattice is to be used; have LatticeFactory take a configuration object.
TODOs of various priorities:
- Implement centering, via Johnsen/Johansson
- Investigate proper sampling method and experiment convergence for JJLattice
- Iterables.partition in DistributedLattice is probably a little slow
- Need to do design review
- Why keep count in supracontext instead of storing it elsewhere?
- Why save supracontexts in constructor of lattices?
- further parallelize DistributedLattice- divide combinations into groups of 1000
- progress tracker for large problems; then we can know if what we're trying to do is just too hard.
- Which lattice implementation to use should be configurable.
- Concept probably shouldn't implement Supracontext.
- write tests for Concept
- write test for LatticeFactory
- write test for LabelerFactory
- implement the usage of m_removeTestExemplar in AnalogicalModeling.java
- perhaps an option sent to SubcontextList?
- experiment with sorting the subcontext list before lattice filling
- most to least mismatches (1's)? Or is random better?
- non-deterministic or rare outcomes first should be better everywhere.
- distributed lattice should die if labeler says not to split
- toString for Labeler classes would be nice
- label splitting should use the strategy pattern
- always have X number of lattices
- lattices are always X size
- choose some size combo based on characteristics of the subcontext list?
- should be able to vary subcontext list implementation
- try out different types of sorting
- change "outcome" to "class" everywhere
- shield out weka classes!
+ stop using doubles for outcomes (instead of integers)
- add a "subFromString" method in the test utils class?
- test specifically the case where there is a tie for class outcome
- clean up tests; give them better names, etc. Use Hamcrest matches to make it all more legible.
- add high-level algorithm explanations for the lattice, etc.
- package-info.java?
Engineering Tasks
- get it working with bigger data sets
SparseLattice:
- fix count, which is incorrect currently because we are not considering grandparents, etc.
- count needs to consider all ancestors, probably meaning we need to update it dynamically during addIntent.
- perhaps we need child and parent nodes; updated count means updating all children.
- flooding algorithm?
Later deliverables:
- lattice viewer (show supras in a tree; GraphViz or Processing or Tikz or something)
- support other types of features as discussed in http://humanities.byu.edu/am/leipzig_margins.pdf.
Thoughts:
- is there a connection between k-d trees and AM's splitting into sub contexts?
- what would it take to make AM more of an updateable classifier, making changes to the lattice as more data is recieved?