Explanations
discussions around academic practices and controversial social topics
Negative Logits
icz         -0.72
oun         -0.72
carbohyd    -0.65
contests    -0.63
atile       -0.62
penal       -0.61
electr      -0.61
heastern    -0.60
rued        -0.60
deserts     -0.60
Positive Logits
Includes    1.17
Lots        1.08
Appears     1.08
Possibly    1.07
Seems       1.06
Including   1.05
Especially  1.03
Also        1.03
Probably    1.03
Contains    1.03
Activations Density 0.311%
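
The density figure above is most naturally read as the share of sampled tokens on which the feature fires at all. The sketch below shows one way such a number could be computed; it assumes density means "fraction of tokens with activation above a threshold", which the page itself does not state. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    is active. This is an illustrative definition, not one taken from the page.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 tokens, of which 3 activate the feature.
sample = [0.0] * 997 + [0.4, 1.2, 2.5]
density = activation_density(sample)

# Format as a percentage with three decimals, matching the style used above.
print(f"{density:.3%}")
```

Under this reading, a density of 0.311% would mean the feature is active on roughly 3 of every 1000 tokens in the sample.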