INDEX
Explanations
numbers with 'minutes', 'days', or 'years'
Negative Logits
unks     -0.78
yz       -0.74
urat     -0.67
Nay      -0.66
Azerb    -0.66
olls     -0.64
ibi      -0.64
Qiao     -0.64
ulhu     -0.62
Pengu    -0.62
Positive Logits
necess            0.87
coll              0.86
relations         0.84
communications    0.83
under             0.81
environment       0.81
administ          0.81
communication     0.81
tem               0.80
recent            0.80
Activations Density: 0.008%