Explanations
terms related to profanity or profane language
references to professional contexts or occupations
Negative Logits
Token      Logit
CAR        -0.67
Pinball    -0.67
Amend      -0.65
Adds       -0.64
akin       -0.64
osite      -0.63
Rabbit     -0.62
Avalon     -0.62
Coil       -0.62
actly      -0.62
Positive Logits
Token      Logit
prof       4.20
prof       2.17
Prof       1.61
Prof       1.55
profiling  1.33
exp        1.14
obsc       1.07
vulgar     1.02
corpor     0.98
perf       0.98
Activations Density: 0.026%