Not a full answer, but at least some evidence. I calculated the expected value for the first few numbers of dice (the value in parentheses is the increase over the previous row):

    1:  3.50000 (+3.50000)
    2:  8.23611 (+4.73611)
    3: 13.42490 (+5.18879)
    4: 18.84364 (+5.41874)
    5: 24.43605 (+5.59241)
    6: 30.15198 (+5.71592)
    7: 35.95216 (+5.80018)
    8: 41.80969 (+5.85753)
    9: 47.70676 (+5.89707)

So, for example, when you have rolled 6-5-1-1, it is better to keep only the 6 and re-roll three dice rather than also keep the 5, as the expected value for three dice is more than 5 larger than for two dice.
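To spell out that comparison with the numbers from the table, here are the four options for 6-5-1-1. This assumes the rules the code in this answer implements: after each roll you keep at least one die (the highest ones, naturally) and re-roll the rest.

$$
\begin{aligned}
\text{keep } 6,\ \text{re-roll three:} &\quad 6 + 13.42490 = 19.42490 \\
\text{keep } 6, 5,\ \text{re-roll two:} &\quad 11 + 8.23611 = 19.23611 \\
\text{keep } 6, 5, 1,\ \text{re-roll one:} &\quad 12 + 3.50000 = 15.50000 \\
\text{keep all four:} &\quad 13
\end{aligned}
$$

Keeping only the 6 wins by about 0.19, which is exactly the amount by which the step from two to three dice (+5.18879) exceeds the 5 you give up.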
I computed these values with the following Haskell code. It uses dynamic programming, but for each number of dice it goes through all possible rolls, hence I stopped at 9 dice:

    import Numeric.Probability.Example.Dice
    import Numeric.Probability.Distribution (expected)
    import Control.Monad
    import Data.List
    import Text.Printf

    -- probs !! n is the expected total when starting with n dice
    -- (a lazy list, so each entry is computed only once)
    probs = map prob [0..]

    prob 0 = 0
    prob n = expected $ do
        roll <- dice n
        let sorted = reverse $ sort roll
        -- keep the m highest dice and re-roll the remaining n - m
        return $ maximum [ fromIntegral (sum (take m sorted)) + (probs !! (n - m)) | m <- [1..n] ]

    main :: IO ()
    main = forM_ (zip3 [1..9] (tail probs) probs) $ \(n, e, p) ->
        printf "%d: %8.5f (+%7.5f)\n" (n::Int) (realToFrac e::Double) (realToFrac (e - p)::Double)

It seems that the differences are approaching 6 from below. If that is the case, and they never surpass 6, then the answer to the third question is "no".