Explanations
words related to expressing thoughts or opinions
references to personal thoughts and opinions
Negative Logits
Token        Logit
ALL          -0.78
Mamm         -0.74
toe          -0.65
ARDS         -0.65
Breach       -0.63
Naz          -0.61
Adin         -0.59
Peninsula    -0.58
two          -0.58
Annex        -0.57
Positive Logits
Token        Logit
fulness      0.97
ileaks       0.92
aloud        0.85
cience       0.83
thoughts     0.83
ets          0.80
mith         0.78
reprene      0.77
provoking    0.77
ynthesis     0.77
Activations Density: 0.022%