INDEX
Explanations
phrases related to actions taken or being taken on something
words related to physical contact, handling, or manipulation
Negative Logits
retri       -0.65
minist      -0.59
女          -0.57
scrimmage   -0.57
upon        -0.57
Siber       -0.56
ä           -0.56
ADRA        -0.56
Balt        -0.55
Antar       -0.55
Positive Logits
olicy       0.90
terday      0.80
acket       0.80
odcast      0.76
undown      0.74
ules        0.72
berra       0.71
onent       0.71
inion       0.71
rodu        0.71
Activations Density 0.016%