INDEX
Explanations
references to historical figures and names in a specific context
Negative Logits
Og        -0.41
vpon      -0.40
سكانية    -0.39
downe     -0.37
eroen     -0.37
mately    -0.37
glab      -0.36
aroos     -0.36
Og        -0.36
наче      -0.36
Positive Logits
IVEREF    0.56
richest   0.53
saddest   0.51
hardest   0.50
toughest  0.50
coldest   0.50
Newest    0.49
darkest   0.49
poorest   0.49
farthest  0.48
Activations Density 0.093%