This paper presents a framework for generating locally controlled arithmetic units. The framework comprises graph generation from a mathematical expression, graph partitioning to determine the locally controlled parts of the design, and VHDL generation. Its output is a pipelined architecture containing locally controlled groups of floating-point units. It is shown that both the partitioning and the placement aspects of the design must be considered to obtain a high-speed circuit. In a well-placeable design, locally controlled groups can be mapped to the FPGA in such a way that only neighboring groups communicate with each other. The presented algorithm produces an initial floorplan of the floating-point units and uses a novel graph partitioning representation to partition these units into a well-placeable design. The framework is demonstrated on the automatic generation of a circuit for a complex mathematical expression related to Computational Fluid Dynamics (CFD). The generated design is 15-27% faster than the unpartitioned, globally controlled one, at the price of a modest area increase. The framework automatically produces well-placeable, deadlock-free partitions even for complex expressions, whereas traditional partitioners cannot target these objectives.