INDEX
Explanations
personal experiences or stories
NEGATIVE LOGITS
rules            -0.61
electromagnetic  -0.60
totality         -0.59
PTS              -0.59
relevance        -0.59
Impact           -0.58
Plaint           -0.57
Allied           -0.55
Gale             -0.55
Georgian         -0.55
POSITIVE LOGITS
'm       1.69
've      1.44
'll      1.23
suppose  1.21
'd       1.21
guess    1.18
am       1.15
ggy      1.11
nex      1.09
dunno    1.08
Activations Density 0.903%