INDEX
Explanations
words related to individuals, possibly with emphasis on authority figures
nouns and their variations, particularly related to people and roles
Negative Logits
Topic        -0.69
duties       -0.60
+/-          -0.56
lodge        -0.56
prank        -0.56
kernels      -0.56
irlf         -0.56
timer        -0.56
projecting   -0.54
ages         -0.54
Positive Logits
Reloaded     0.88
nah          0.87
aban         0.87
pha          0.86
omi          0.84
iq           0.84
aki          0.84
ea           0.82
hol          0.80
cia          0.80
Activations Density 0.118%