Explanations
adjectives describing characteristics or actions of people
references to groups of people and their actions or beliefs
Negative Logits
ĸļ                    -0.95
enegger               -0.79
=~                    -0.73
TAMADRA               -0.72
atever                -0.69
................      -0.64
externalActionCode    -0.63
-+                    -0.61
Printing              -0.60
=-=-=-=-=-=-=-=-      -0.60
Positive Logits
rosso                 0.77
hops                  0.71
ilo                   0.70
trop                  0.67
perm                  0.66
uce                   0.65
inelli                0.63
enos                  0.63
hop                   0.61
assium                0.60
Activations Density 0.488%
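
As a rough illustration of what a density figure like the one above could mean, the sketch below computes the percentage of token positions on which a feature fires. It assumes that "activation density" is the share of positions with activation above a threshold, which the page itself does not state. The function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: density is defined as (active positions / all positions);
    the source page does not spell out its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 1000 token positions, 5 of which are active.
sample = [0.0] * 995 + [0.3, 1.2, 0.7, 2.1, 0.05]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.488% would mean the feature is active on roughly 1 in 205 token positions of whatever corpus was sampled.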