INDEX
Explanations
phrases instructing to pay attention to specific things or events
phrases related to watching or monitoring something attentively
Negative Logits
Token       Logit
venge       -0.80
Gibbs       -0.66
Guitar      -0.65
HAHAHAHA    -0.63
Grade       -0.62
bet         -0.61
Whites      -0.61
DNA         -0.61
hens        -0.61
ĪĴ          -0.60
Positive Logits
Token       Logit
âĦ¢:        0.74
ahead       0.73
eful        0.73
gency       0.71
ulatory     0.70
emark       0.70
ãĤ¶         0.67
oin         0.67
agascar     0.66
uum         0.66
Activations Density: 0.031%
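
Some tokens in the logit lists (for example `âĦ¢:`, `ãĤ¶`, and `ĪĴ`) look like encoding debris, but they may simply be raw vocabulary strings. The sketch below assumes the model uses a GPT-2 style byte-level BPE tokenizer, which the listing itself does not confirm. Under that assumption, each displayed character stands for one byte, and the mapping can be inverted to recover readable text. The helper names `token_bytes` and `decode_token` are hypothetical, chosen only for illustration.

```python
def bytes_to_unicode():
    """Rebuild the byte -> printable-character table used by GPT-2 style byte-level BPE."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_bytes(token):
    """Return the raw bytes behind a displayed token string (assumption: byte-level BPE)."""
    return bytes(UNICODE_TO_BYTE[ch] for ch in token)


def decode_token(token):
    """Decode a displayed token to text; incomplete UTF-8 becomes U+FFFD."""
    return token_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["âĦ¢:", "ãĤ¶", "ĪĴ", "ahead"]:
        print(repr(tok), "->", repr(decode_token(tok)), token_bytes(tok))
```

Under this assumption, `âĦ¢:` decodes to `™:` and `ãĤ¶` to the katakana `ザ`, while `ĪĴ` corresponds to the two bytes `0x88 0x92`, which do not form a complete UTF-8 character on their own. Plain ASCII tokens such as `ahead` are returned unchanged.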