INDEX
Explanations
phrases describing individuals who perform specific actions or hold certain roles
references to individuals who perform notable actions or roles
Negative Logits
ply      -0.78
Bron     -0.72
fml      -0.71
van      -0.68
early    -0.66
aned     -0.63
enei     -0.61
ount     -0.61
backs    -0.61
vette    -0.61
Positive Logits
Teresa       0.71
manufact     0.69
iest         0.68
Gamergate    0.68
tallest      0.67
IG           0.67
Tata         0.66
hardest      0.66
Restore      0.65
Alc          0.65
Activations Density: 0.376%