INDEX
Explanations
references to emotional experiences and expressions of personal feelings
Negative Logits
isel        -0.19
cartoons    -0.15
ulp         -0.15
urge        -0.14
ocker       -0.14
アニメ      -0.14
漫画        -0.14
ampa        -0.13
Albums      -0.13
unday       -0.13
Positive Logits
shooting    0.72
shoot       0.70
shoots      0.63
shot        0.59
Shooting    0.55
shoot       0.53
Shoot       0.50
Shoot       0.50
shootings   0.49
shots       0.48
Activations Density 0.328%
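
As a rough illustration of the density figure above, here is a minimal sketch that assumes "activations density" means the percentage of token positions on which the feature fires (activation above zero). That definition, the function name `activation_density`, and the toy data are assumptions for illustration, not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: density = (positions where the feature fires) / (all positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy data: 1,000 token positions, 3 of which fire.
toy = [0.0] * 997 + [0.4, 1.2, 2.5]
print(f"{activation_density(toy):.3f}%")  # prints 0.300%
```

Under this reading, a density of 0.328% would correspond to the feature firing on roughly 3 in every 1,000 token positions.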