INDEX
Explanations
various types of human activities
mentions of various activities
Negative Logits
Token      Logit
ixed       -0.70
ocard      -0.69
oiler      -0.68
passage    -0.67
Enough     -0.66
oning      -0.63
cracked    -0.63
uran       -0.62
Boss       -0.62
AU         -0.61
Positive Logits
Token        Logit
activities   1.02
ional        0.95
undertaken   0.89
eering       0.86
ivism        0.84
rador        0.80
ational      0.76
anooga       0.75
activity     0.75
Activities   0.73
Activations Density 0.028%