O que determina o número de threads que um Java ForkJoinPool cria?

Question

May 29, 2012, 12:44 PM

java fork-join threadpool parallel-processing

O que determina o número de threads que um Java ForkJoinPool cria?

Tanto quanto eu tinha entendidoForkJoinPool, esse conjunto cria um número fixo de encadeamentos (padrão: número de núcleos) e nunca criará mais encadeamentos (a menos que o aplicativo indique a necessidade deles usandomanagedBlock).

No entanto, usandoForkJoinPool.getPoolSize() Eu descobri que em um programa que cria 30.000 tarefas (RecursiveAction), aForkJoinPool a execução dessas tarefas usa 700 threads em média (os threads são contados toda vez que uma tarefa é criada). As tarefas não fazem I / O, mas pura computação; a única sincronização entre tarefas está chamandoForkJoinTask.join() e acessandoAtomicBooleans, isto é, não há operações de bloqueio de encadeamento.

Desde ajoin() não bloqueia o thread de chamada como eu o entendo, não há nenhuma razão para que qualquer thread no pool nunca bloqueie, e assim (eu tinha assumido) não deveria haver razão para criar threads adicionais (o que obviamente está acontecendo no entanto).

Então, por queForkJoinPool criar tantos tópicos? Quais fatores determinam o número de threads criados?

Eu esperava que essa pergunta pudesse ser respondida sem postar código, mas aqui vem a pedido. Este código é um trecho de um programa de quatro vezes o tamanho, reduzido às partes essenciais; não compila como é. Se desejar, posso postar o programa completo também.

O programa pesquisa um labirinto por um caminho desde um determinado ponto inicial até um determinado ponto final, usando a pesquisa em profundidade. Uma solução é garantida para existir. A lógica principal está nocompute() método deSolverTask: UMARecursiveAction que começa em algum ponto dado e continua com todos os pontos vizinhos alcançáveis a partir do ponto atual. Em vez de criar um novoSolverTask em cada ponto de ramificação (o que criaria muitas tarefas), ele empurra todos os vizinhos, com exceção de um, para uma pilha de retrocesso a ser processada posteriormente e continua com apenas um vizinho não empurrado para a pilha. Quando chega a um beco sem saída, o ponto mais recentemente empurrado para a pilha de backtracking é estourado, e a busca continua de lá (cortando o caminho construído a partir do ponto de partida do taks). Uma nova tarefa é criada quando uma tarefa encontra sua pilha de retrocesso maior que um determinado limite; a partir desse momento, a tarefa, embora continue aparecendo a partir de sua pilha de backtracking até que esteja esgotada, não empurrará nenhum outro ponto para sua pilha ao alcançar um ponto de ramificação, mas criará uma nova tarefa para cada ponto. Assim, o tamanho das tarefas pode ser ajustado usando o limite de limite de pilha.

Os números que citei acima ("30.000 tarefas, 700 threads em média") são da pesquisa de um labirinto de 5000x5000 células. Então, aqui está o código essencial:

class SolverTask extends RecursiveTask<ArrayDeque<Point>> {
// Once the backtrack stack has reached this size, the current task
// will never add another cell to it, but create a new task for each
// newly discovered branch:
private static final int MAX_BACKTRACK_CELLS = 100*1000;

/**
 * @return Tries to compute a path through the maze from local start to end
 * and returns that (or null if no such path found)
 */
@Override
public ArrayDeque<Point>  compute() {
    // Is this task still accepting new branches for processing on its own,
    // or will it create new tasks to handle those?
    boolean stillAcceptingNewBranches = true;
    Point current = localStart;
    ArrayDeque<Point> pathFromLocalStart = new ArrayDeque<Point>();  // Path from localStart to (including) current
    ArrayDeque<PointAndDirection> backtrackStack = new ArrayDeque<PointAndDirection>();
    // Used as a stack: Branches not yet taken; solver will backtrack to these branching points later

    Direction[] allDirections = Direction.values();

    while (!current.equals(end)) {
        pathFromLocalStart.addLast(current);
        // Collect current's unvisited neighbors in random order: 
        ArrayDeque<PointAndDirection> neighborsToVisit = new ArrayDeque<PointAndDirection>(allDirections.length);  
        for (Direction directionToNeighbor: allDirections) {
            Point neighbor = current.getNeighbor(directionToNeighbor);

            // contains() and hasPassage() are read-only methods and thus need no synchronization
            if (maze.contains(neighbor) && maze.hasPassage(current, neighbor) && maze.visit(neighbor))
                neighborsToVisit.add(new PointAndDirection(neighbor, directionToNeighbor.opposite));
        }
        // Process unvisited neighbors
        if (neighborsToVisit.size() == 1) {
            // Current node is no branch: Continue with that neighbor
            current = neighborsToVisit.getFirst().getPoint();
            continue;
        }
        if (neighborsToVisit.size() >= 2) {
            // Current node is a branch
            if (stillAcceptingNewBranches) {
                current = neighborsToVisit.removeLast().getPoint();
                // Push all neighbors except one on the backtrack stack for later processing
                for(PointAndDirection neighborAndDirection: neighborsToVisit) 
                    backtrackStack.push(neighborAndDirection);
                if (backtrackStack.size() > MAX_BACKTRACK_CELLS)
                    stillAcceptingNewBranches = false;
                // Continue with the one neighbor that was not pushed onto the backtrack stack
                continue;
            } else {
                // Current node is a branch point, but this task does not accept new branches any more: 
                // Create new task for each neighbor to visit and wait for the end of those tasks
                SolverTask[] subTasks = new SolverTask[neighborsToVisit.size()];
                int t = 0;
                for(PointAndDirection neighborAndDirection: neighborsToVisit)  {
                    SolverTask task = new SolverTask(neighborAndDirection.getPoint(), end, maze);
                    task.fork();
                    subTasks[t++] = task;
                }
                for (SolverTask task: subTasks) {
                    ArrayDeque<Point> subTaskResult = null;
                    try {
                        subTaskResult = task.join();
                    } catch (CancellationException e) {
                        // Nothing to do here: Another task has found the solution and cancelled all other tasks
                    }
                    catch (Exception e) {
                        e.printStackTrace();
                    }
                    if (subTaskResult != null) { // subtask found solution
                        pathFromLocalStart.addAll(subTaskResult);
                        // No need to wait for the other subtasks once a solution has been found
                        return pathFromLocalStart;
                    }
                } // for subTasks
            } // else (not accepting any more branches) 
        } // if (current node is a branch)
        // Current node is dead end or all its neighbors lead to dead ends:
        // Continue with a node from the backtracking stack, if any is left:
        if (backtrackStack.isEmpty()) {
            return null; // No more backtracking avaible: No solution exists => end of this task
        }
        // Backtrack: Continue with cell saved at latest branching point:
        PointAndDirection pd = backtrackStack.pop();
        current = pd.getPoint();
        Point branchingPoint = current.getNeighbor(pd.getDirectionToBranchingPoint());
        // DEBUG System.out.println("Backtracking to " +  branchingPoint);
        // Remove the dead end from the top of pathSoFar, i.e. all cells after branchingPoint:
        while (!pathFromLocalStart.peekLast().equals(branchingPoint)) {
            // DEBUG System.out.println("    Going back before " + pathSoFar.peekLast());
            pathFromLocalStart.removeLast();
        }
        // continue while loop with newly popped current
    } // while (current ...
    if (!current.equals(end)) {         
        // this task was interrupted by another one that already found the solution 
        // and should end now therefore:
        return null;
    } else {
        // Found the solution path:
        pathFromLocalStart.addLast(current);
        return pathFromLocalStart;
    }
} // compute()
} // class SolverTask

@SuppressWarnings("serial")
public class ParallelMaze  {

// for each cell in the maze: Has the solver visited it yet?
private final AtomicBoolean[][] visited;

/**
 * Atomically marks this point as visited unless visited before
 * @return whether the point was visited for the first time, i.e. whether it could be marked
 */
boolean visit(Point p) {
    return  visited[p.getX()][p.getY()].compareAndSet(false, true);
}

public static void main(String[] args) {
    ForkJoinPool pool = new ForkJoinPool();
    ParallelMaze maze = new ParallelMaze(width, height, new Point(width-1, 0), new Point(0, height-1));
    // Start initial task
    long startTime = System.currentTimeMillis();
     // since SolverTask.compute() expects its starting point already visited, 
    // must do that explicitly for the global starting point:
    maze.visit(maze.start);
    maze.solution = pool.invoke(new SolverTask(maze.start, maze.end, maze));
    // One solution is enough: Stop all tasks that are still running
    pool.shutdownNow();
    pool.awaitTermination(Integer.MAX_VALUE, TimeUnit.DAYS);
    long endTime = System.currentTimeMillis();
    System.out.println("Computed solution of length " + maze.solution.size() + " to maze of size " + 
            width + "x" + height + " in " + ((float)(endTime - startTime))/1000 + "s.");
}