INDEX
Explanations
names of specific individuals
names of famous artists and public figures
Negative Logits
Newsletter    -0.74
RIC           -0.62
CPR           -0.60
salsa         -0.59
Duterte       -0.58
BLIC          -0.56
contingent    -0.56
CENT          -0.55
recoil        -0.55
Zac           -0.55
Positive Logits
uren          0.96
enburg        0.86
eden          0.84
burgh         0.81
ulkan         0.78
eren          0.75
atu           0.74
chn           0.73
ouver         0.72
wen           0.71
Activations Density: 0.056%