INDEX
Explanations
the word "find" followed by different phrases
expressions of personal opinions or evaluations
Negative Logits
rolet        -0.71
luck         -0.64
idium        -0.63
istry        -0.63
isoft        -0.63
bailed       -0.61
oiler        -0.59
concess      -0.57
scrimmage    -0.56
iere         -0.56
Positive Logits
myself        0.84
¿½            0.81
phas          0.81
¤             0.76
¶æ            0.74
it            0.73
yourself      0.72
irresistible  0.72
ways          0.72
fault         0.72
Activations Density 0.050%