INDEX
Explanations
phrases indicating personal experiences or interactions
Negative Logits
accordingly     -0.65
lying           -0.65
arry            -0.64
interstitial    -0.64
umbing          -0.63
peak            -0.62
iter            -0.62
anwhile         -0.62
thats           -0.61
annel           -0.60
Positive Logits
opportunity     1.39
privilege       1.28
misfortune      1.21
pleasure        1.13
courage         1.07
chance          1.06
option          1.03
utmost          0.99
displeasure     0.97
guts            0.97
Activations Density: 0.097%