Explanations
expressions of personal beliefs or opinions
expressions of personal feelings and identities
Negative Logits
Quotes      -0.68
Saharan     -0.66
Analysis    -0.63
eret        -0.62
Americ      -0.62
Pros        -0.62
Gaul        -0.61
bush        -0.60
Fix         -0.59
Decre       -0.59
Positive Logits
ック        0.74
myself      0.70
ール        0.69
arella      0.65
psy         0.65
wondering   0.64
ゴ          0.64
aware       0.63
aze         0.63
nervous     0.61
Activations Density: 0.082%