INDEX
Explanations
phrases that indicate frequency or occurrence of actions
Negative Logits
VersionUID
-0.49
quete
-0.48
uarie
-0.46
ales
-0.44
Disliked
-0.44
findpost
-0.42
învă
-0.42
ั่ง
-0.42
("]");
-0.41
droj
-0.41
Positive Logits
writeFieldEnd
0.99
everytime
0.94
whenever
0.90
whenever
0.85
Whenever
0.83
every
0.83
Whenever
0.82
every
0.79
Every
0.71
EVERY
0.69
Activations Density 0.208%
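The density figure above is most naturally read as the fraction of token positions on which the feature fires at all. The sketch below shows one way such a number could be computed. It assumes that "active" means an activation strictly above a threshold of zero; the function name `activation_density` and the sample data are hypothetical, chosen only so the arithmetic reproduces 0.208%, and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: a position counts as "active" when its activation is
    strictly greater than the threshold (zero by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data, not from the source: 100,000 token positions,
# of which 208 have a nonzero activation.
sample = [0.0] * 99_792 + [1.5] * 208
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would be consistent with a sparse feature that fires only on a narrow set of tokens, such as the "whenever" and "every" variants listed under the positive logits.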