INDEX
Explanations
adjectives or verb phrases indicating judgments or opinions
words and phrases related to existence, reality, and belief assertions
Negative Logits
Token         Logit
çīĪ           -0.66
rises         -0.64
aughters      -0.64
ofi           -0.63
winds         -0.62
imar          -0.62
eatures       -0.62
ispers        -0.58
refres        -0.58
cellaneous    -0.58
Positive Logits
Token         Logit
?,            1.01
izable        0.94
anymore       0.87
somehow       0.82
acea          0.79
enough        0.79
/,            0.79
someday       0.77
achable       0.73
.*            0.72
Activations Density 0.627%