INDEX
Explanations
statements related to actions performed by individuals or groups
Negative Logits
assi         -0.78
orum         -0.60
risen        -0.57
Breaker      -0.57
usha         -0.56
Shining      -0.55
ses          -0.55
Flavoring    -0.54
usefulness   -0.54
runway       -0.54
Positive Logits
concurrently     1.14
electronically   1.06
privately        1.05
indoors          1.01
anonymously      1.00
collabor         0.98
differently      0.96
exclusively      0.95
cheaply          0.94
outdoors         0.93
Activations Density 0.344%