INDEX
Explanations
actions or statements related to understanding or purposeful behavior
phrases related to actions and activities being performed
Negative Logits
mone        -0.67
reet        -0.66
ases        -0.66
laun        -0.65
ikers       -0.65
Pon         -0.59
©¶æ         -0.59
fitting     -0.58
asers       -0.57
iability    -0.57
Positive Logits
here        0.75
now         0.73
Ø©          0.73
tonight     0.72
NOW         0.68
today       0.68
iatus       0.67
igating     0.66
hani        0.65
HERE        0.63
Activations Density 0.119%