INDEX
Explanations
numerical information or quantities given in relation to a certain context or condition
Negative Logits
quit        -0.73
ãĤº         -0.64
anwhile     -0.61
ãĤ¿         -0.60
icides      -0.59
gencies     -0.58
window      -0.58
âĵĺ         -0.57
happiest    -0.56
tions       -0.55
Positive Logits
partially    1.07
partly       0.96
SOME         0.86
partial      0.84
temporarily  0.83
half         0.79
one          0.77
some         0.77
twice        0.75
tacit        0.73
Activations Density 0.043%
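
The density figure above can be read as the share of sampled tokens on which the feature fires. The page does not say how it was computed, so the sketch below rests on an assumption: density is the fraction of token positions whose activation exceeds a threshold of zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and used only for illustration.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active. The dashboard itself does not state its definition.
    """
    if len(activations) == 0:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic data (not taken from the dashboard): 43 active positions
# out of 100,000 sampled tokens.
sample = [0.0] * 99_957 + [1.5] * 43
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 0.043% means the feature is active on roughly 43 of every 100,000 tokens, which would make it a sparse feature.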