Explanations
dates in historical contexts
numerical values or statistics
Negative Logits
seams      -0.74
lobb       -0.65
trophies   -0.64
roofs      -0.61
noses      -0.61
tongues    -0.60
streaks    -0.59
setting    -0.59
nose       -0.59
balcon     -0.58
Positive Logits
]      1.11
][     1.07
]).    1.01
]"     0.97
].     0.93
],[    0.92
]:     0.91
])     0.90
]),    0.85
];     0.85
Activations Density: 0.038%