INDEX
Explanations
adjectives describing people
references to people and their characteristics
Negative Logits
seys       -0.68
ansk       -0.67
VIDEOS     -0.67
iners      -0.65
uld        -0.64
uden       -0.64
undreds    -0.64
çīĪ        -0.61
ousands    -0.61
Mehran     -0.61
Positive Logits
whose          0.98
who            0.94
whom           0.89
nonetheless    0.86
nered          0.84
indeed         0.80
devoid         0.76
capable        0.75
endowed        0.74
possessing     0.73
Activations Density: 0.199%