INDEX
Explanations
phrases related to providing information or suggestions
sentences that indicate statements or conclusions
Negative Logits
Token       Logit
lyak        -0.65
emic        -0.65
obliter     -0.64
onga        -0.63
manif       -0.61
icz         -0.60
victory     -0.60
criminal    -0.60
vation      -0.59
decap       -0.59
Positive Logits
Token           Logit
tumblr          1.08
htm             1.01
Includes        1.00
Specifically    1.00
Besides         0.98
blogspot        0.98
Additionally    0.97
Typically       0.96
Examples        0.94
Especially      0.93
Activations Density: 0.366%