INDEX
Explanations
names of individuals or specific people
specific classifications or categories related to artistic works or elements
Negative Logits
ĸļ          -0.73
unic        -0.67
pora        -0.67
havens      -0.63
pees        -0.63
quar        -0.62
err         -0.62
balloons    -0.59
colonies    -0.59
surg        -0.59
Positive Logits
illin       0.75
Centauri    0.72
iddled      0.63
phe         0.62
ript        0.61
lehem       0.60
Baghd       0.60
å§          0.60
lamm        0.59
ת           0.59
Activations Density 0.259%