Explanations
references to individuals, particularly those in positions of authority or influence
names of different people
Negative Logits
essional     -0.72
DAQ          -0.69
hops         -0.69
âĸĵ          -0.66
ISTORY       -0.65
ãĥīãĥ©       -0.65
uzz          -0.62
FORMATION    -0.62
Ñĭ           -0.60
bob          -0.60
Positive Logits
ongyang      0.81
pai          0.80
artisan      0.80
enei         0.78
Plat         0.77
Stras        0.76
Kant         0.75
ilon         0.74
lett         0.74
sonian       0.73
Activations Density 0.038%