# Area-Power Analysis of FFT Based Digital Beamforming for GEO, MEO, and LEO Scenarios

Rakesh Palisetty\*, Geoffrey Eappen\*, Jorge Luis Gonzalez Rios\*, Juan Carlos Merlano Duncan\*,

Stavros Domouchtsidis<sup>\*</sup>, Symeon Chatzinotas<sup>\*</sup>, Björn Ottersten<sup>\*</sup>,

Bingen Cortazar<sup>†</sup>, Salvatore D'Addio<sup>†</sup> and Piero Angeletti<sup>†</sup>

\*Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg

<sup>†</sup>European Space Agency, The Netherlands

Email: (rakesh.palisetty,geoffrey.eappen,jorge.gonzalez,juan.duncan,symeon.chatzinotas,bjorn.ottersten)@uni.lu, (Bingen.Cortazar,Salvatore.Daddio,Piero.Angeletti)@esa.int

Abstract-Satellite communication systems can provide seamless wireless coverage directly or through complementary ground-terrestrial components and are projected to be incorporated into future wireless networks, particularly 5G and beyond. Increased capacity and flexibility in telecom satellite payloads based on classic radio frequency technology have traditionally translated into increased power consumption and dissipation. Much of the analog hardware in a satellite communications payload can be replaced with highly integrated digital components that are often smaller, lighter, and less expensive, as well as software reprogrammable. Digital beamforming of thousands of beams simultaneously is not practical due to the limited power available to onboard satellite processors. Reduced digital beamforming power consumption would enable the deployment of a fully digital payload, resulting in comprehensive user applications. Beamforming can be implemented using matrix multiplication, a hybrid methodology, or a discrete Fourier transform (DFT). Implementing the DFT via the fast Fourier transform (FFT) reduces the power consumption, processing time, hardware requirements, and chip area. Therefore, in this paper, area-power efficient FFT architectures for digital beamforming are analyzed. The area in terms of look up tables (LUTs) is estimated and compared among the conventional FFT, the fully unrolled FFT, and a 4-bit quantized twiddle factor (TF) FFT. Further, area and power estimates are reported for typical satellite scenarios.

*Index Terms*—Array factor, Beamforming, Look up table, Quantization, Power estimation,

## I. INTRODUCTION

With the advancement of wireless communication towards beyond fifth generation (B5G) technologies, satellite communication (SatCom) needs to provide services such as high-speed internet and other multimedia services to remote areas [1]. Current and future very high throughput payloads must simultaneously create thousands of beams across the coverage region, necessitating a massive aggregated beamforming capacity. Given the strict and accurate design requirements for most analog radio frequency chains for space applications, reconfiguring analog hardware is often constrained. Fully analog beamformers are not very common since they are bulky, lack flexibility, and have high power consumption [2]. On the other hand, using a direct radiating array (DRA) is not a straightforward task. Nowadays, the only practical way to achieve the desired gain in many scenarios is to employ feeder and reflector schemes.

In addition, hybrid beamforming approaches are the only alternative to implement DRAs [3], but hybrid beamforming comes with a trade-off between complexity and flexibility [4]. Digital hardware advancements are rapidly lowering the cost and increasing the capability of the digital components needed to construct digital beamformers. A fully digital beamformer can form multiple high-gain beams without degrading the signal-to-interference-plus-noise ratio (SINR) and has complete flexibility in terms of beam steering and power allocation [5]. Due to the limited power available onboard, digital beamforming of such a vast aggregated capacity is currently not practical. As a result, onboard processors can only handle a small percentage of the overall system capacity for digital beamforming. Reduced digital beamforming power consumption would enable the deployment of a fully digital payload, resulting in capacity allocation flexibility to serve a wide range of user applications [6]. This calls for an area- and power-efficient beamformer that keeps the flexibility of fully digital solutions.

Digital beamforming can be implemented with a matrix-by-vector multiplication in a brute-force approach [7]. The number of multiplications increases quadratically with the number of antennas and input streams, and each scalar multiplication is a power- and area-hungry operation in hardware. The natural solution for implementing digital beamforming with uniform linear or rectangular arrays is to use the discrete Fourier transform (DFT) as the beamforming matrix [8]. The DFT can then be implemented efficiently using the fast Fourier transform (FFT) algorithm [9].

The FFT beamformer allows a better realization of real-time beamformers, with lower circuit complexity and power consumption than matrix-by-vector multiplication beamforming. The N-point FFT can efficiently generate N beams, which are placed quasi-uniformly in the angular domain. However, FFT beamforming is not flexible since it produces a fixed beam pattern. On the other hand, if the FFT size is enlarged in comparison to the number of antennas, the payload is able to synthesize a larger number of overlapping beams, which provides a finer granularity for the beam pointing. This enhancement comes at the price of the increased complexity of the larger FFT size [10].
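The granularity argument can be illustrated with a small numerical sketch. It is not taken from the paper: the 8-element array, the source at normalized spatial frequency 5/32, and the helper names `zero_padded_dft` and `peak` are assumptions chosen for illustration, and a direct DFT is used instead of an FFT for brevity.

```python
import cmath
import math

def zero_padded_dft(x, size):
    """Direct DFT of x, implicitly zero-padded to `size` output beams."""
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / size) for n in range(len(x)))
        for k in range(size)
    ]

def peak(spectrum):
    """Return (bin index, magnitude) of the strongest beam."""
    mags = [abs(v) for v in spectrum]
    k = max(range(len(mags)), key=mags.__getitem__)
    return k, mags[k]

NUM_ANTENNAS = 8   # assumed array size, for illustration only
U = 5 / 32         # assumed normalized spatial frequency of the source

# Snapshot seen by the array for a single plane wave.
snapshot = [cmath.exp(2j * math.pi * U * n) for n in range(NUM_ANTENNAS)]

bin_8, mag_8 = peak(zero_padded_dft(snapshot, 8))     # FFT size = array size
bin_32, mag_32 = peak(zero_padded_dft(snapshot, 32))  # FFT size = 4 x array size
print(bin_8, round(mag_8, 2), bin_32, round(mag_32, 2))
```

With the 8-beam grid the source falls between two beams and the strongest beam does not reach the full array gain of 8, whereas the 32-beam grid contains a beam pointed exactly at the source.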

Therefore, in this paper, we estimate the area, in terms of look up tables (LUTs), consumed by different FFT architectures. Initially, we analyze the conventional FFT, which processes one input sample per cycle of operation in a rolled fashion. Next, a fully unrolled FFT that processes N samples simultaneously is analyzed. Implementing a fully unrolled FFT leads to high area and power consumption. To further reduce the area and power consumption of a fully unrolled FFT, a 4-bit quantized twiddle factor (TF) FFT is proposed. The area in terms of LUTs is estimated for different FFT sizes, and the beamforming plots for the conventional FFT and the 4-bit quantized TF FFT are analyzed. Furthermore, the area and power are estimated for array sizes suitable for typical geostationary equatorial orbit (GEO), medium Earth orbit (MEO), and low Earth orbit (LEO) satellite scenarios.

The rest of the paper is organized as follows. Section II gives the area-efficient analysis of the FFT choices. Then, the beamforming analysis of the conventional FFT and the 4-bit quantized TF FFT is presented in Section III. Finally, a preliminary assessment of the complexity of FFT-based onboard beamforming in different scenarios is presented in Section IV, followed by some concluding remarks in Section V.

## II. AREA EFFICIENT FFT ANALYSIS

The two-dimensional (2D) FFT used in the multi-beam beamformer is performed through two one-dimensional (1D) FFTs. First, the row-wise 1D FFT operation is performed, and the obtained result is transposed, followed by the column-wise 1D FFT operation with the result transposed again [9]. Although this type of FFT beamforming has several good properties and can be implemented with the help of FFT algorithms, it might still be too complex for specific satellite applications where it is essential to make efficient use of on-board resources such as power and mass. The main problem is the presence of multiplication operations due to the twiddle factors (TFs), which consume more resources.
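The row-column decomposition described above can be sketched as follows. This is an illustrative check rather than the authors' implementation: it uses a direct 1D DFT in place of an FFT, and the function names and the small 4 x 4 test array are assumptions.

```python
import cmath
import math

def dft(x):
    """Direct 1D DFT (stand-in for a 1D FFT block)."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dft2_row_column(a):
    """Row-wise 1D transforms, transpose, column-wise 1D transforms, transpose back."""
    rows_done = [dft(row) for row in a]
    cols_done = [dft(col) for col in transpose(rows_done)]
    return transpose(cols_done)

def dft2_direct(a):
    """Reference 2D DFT evaluated from its double-sum definition."""
    r_n, c_n = len(a), len(a[0])
    return [
        [
            sum(
                a[r][c] * cmath.exp(-2j * math.pi * (u * r / r_n + v * c / c_n))
                for r in range(r_n)
                for c in range(c_n)
            )
            for v in range(c_n)
        ]
        for u in range(r_n)
    ]

# Arbitrary 4 x 4 element snapshot used only to compare the two routes.
array_snapshot = [[complex(r + 1, c - r) for c in range(4)] for r in range(4)]
fast = dft2_row_column(array_snapshot)
ref = dft2_direct(array_snapshot)
max_error = max(abs(fast[u][v] - ref[u][v]) for u in range(4) for v in range(4))
print(max_error < 1e-9)
```

For an N x N array, this route uses N row transforms plus N column transforms, that is, 2N one-dimensional N-point transforms.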

In commercially available products for application specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs), the FFT block processes one input sample per cycle of operation in a rolled fashion. Typically, all the stages are pipelined to achieve the target throughput of one sample per cycle (usually a clock cycle) [11], [12]. Using this type of architecture requires replicating the hardware N times if N samples are needed simultaneously (as is the case for beamforming). Additionally, in the rolled implementation the hardware is re-used in each cycle, which requires a fully operational multiplier since the TFs change from one cycle to the next. Those TFs are typically stored in memory and are read upon request. A fully unrolled FFT can generate N samples simultaneously [13].

A direct DFT computation requires  $N \times N$ multiplications and  $(N-1) \times N$  additions. Therefore, as an efficient alternative to the direct DFT, FFT architectures are adopted in beamforming to reduce the number of multiplications to  $N/2 \times \log_2(N)$  and the number of additions to  $N \times \log_2(N)$ . The FFT is a fast algorithm for DFT computation that is realized using radix-2, radix-4, and radix-8 architectures [14].

For radix-2 FFT:

Total number of Multiplications =  $N/2 \times \log_2(N)$ . With N/2 multiplications per stage.

Total number of Additions =  $N \times \log_2(N)$ . With N additions per stage.

For radix-4 FFT:

The number of stages is  $\log_4(N)$ , and each stage contains N/4 butterfly blocks. Each of these single butterflies consumes twelve complex adders and three complex multipliers. By appropriately arranging the butterfly structure, it is noticed that a total of eight complex adders and three complex multipliers are required. Therefore,

Total number of Multiplications =  $3 \times N/4 \times \log_4(N)$ 

Total number of Additions =  $8 \times N/4 \times \log_4(N)$ , which equals  $N \times \log_2(N)$ 

It is observed that radix-4 requires only 75% of the complex multipliers of an equivalent radix-2 FFT, while the number of adders remains the same in both cases. Therefore, the radix-4 architecture is preferred as the more efficient alternative. Further in this paper, an area-power analysis of the conventional FFT [15], the fully unrolled FFT [16], [17], and a 4-bit quantized TF FFT is carried out to estimate the area and power consumption for different array sizes suitable for GEO, MEO, and LEO scenarios.
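The operation counts quoted above can be tabulated with a few lines of code. The sketch below simply evaluates the formulas from this section; the function names are illustrative and not part of the paper.

```python
def int_log(n, base):
    """Number of stages: log_base(n) for n an exact power of `base`."""
    stages = 0
    while n > 1:
        if n % base:
            raise ValueError(f"{n} is not a power of {base}")
        n //= base
        stages += 1
    return stages

def dft_ops(n):
    """(complex multiplications, complex additions) of a direct DFT."""
    return n * n, (n - 1) * n

def radix2_ops(n):
    stages = int_log(n, 2)
    return (n // 2) * stages, n * stages

def radix4_ops(n):
    stages = int_log(n, 4)
    butterflies = n // 4          # butterflies per stage
    return 3 * butterflies * stages, 8 * butterflies * stages

for size in (16, 64, 256, 1024):
    r2_mul, r2_add = radix2_ops(size)
    r4_mul, r4_add = radix4_ops(size)
    print(size, dft_ops(size), (r2_mul, r2_add), (r4_mul, r4_add), r4_mul / r2_mul)
```

For every power-of-four size the multiplier ratio printed in the last column is 0.75 and the adder counts of the two radices coincide, in line with the observation above.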

#### A. Estimation of LUTs for conventional radix-4 FFT

The multipliers discussed while comparing radix-2 and radix-4 are complex multipliers. Those multipliers need to be addressed in real multipliers and real adders to estimate the complexity accurately. The conventional representation of one complex multiplier is equivalent to four real multipliers and two real adders, i.e.,

 $(C+iS) \times (X+iY) = (CX - SY) + i(CY + SX)$ 

As noted above, one complex multiplier requires four real multipliers and two real adders/subtractors. Note that the hardware resources consumed by real multipliers are significantly higher than those consumed by real adders. Hence, it is beneficial to use fewer multipliers. Also, one complex adder is equal to two real adders. The above complex multiplier can be optimized to use three real multipliers and five real adders [18].
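A minimal sketch of the two multiplier structures is given below. The three-multiplier form shown is one common arrangement that uses three real multiplications and five real additions; the exact arrangement in [18] may differ, and the function names are illustrative.

```python
def cmul_4m2a(c, s, x, y):
    """(C + iS)(X + iY) with four real multiplications and two real additions."""
    return c * x - s * y, c * y + s * x

def cmul_3m5a(c, s, x, y):
    """Same product with three real multiplications and five real additions."""
    k1 = c * (x + y)   # addition 1, multiplication 1
    k2 = x * (s - c)   # addition 2, multiplication 2
    k3 = y * (c + s)   # addition 3, multiplication 3
    real = k1 - k3     # addition 4: CX - SY
    imag = k1 + k2     # addition 5: CY + SX
    return real, imag

# Integer operands keep the comparison exact.
print(cmul_4m2a(3, 4, 5, 6), cmul_3m5a(3, 4, 5, 6))
```

Both calls return the same pair, so the second structure trades one multiplier for three extra adders.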

Case I: Conventional way of implementing complex multiplier, i.e., one complex multiplier is equal to four real multipliers and two real adders. For a single butterfly stage in radix-4,

Real multipliers =  $4 \times 3N/4 = 12N/4$ 

Real adders =  $2 \times 3N/4 + 2 \times 8N/4 = 6N/4 + 16N/4 = 22N/4$ 

Therefore, the complexity of each stage is 12N/4 (real multipliers) and 22N/4 (real adders).

Case II: One complex multiplier equal to three real multipliers and five real adders

Real multipliers =  $3 \times 3N/4 = 9N/4$ 

Real adders =  $5 \times 3N/4 + 2 \times 8N/4 = 15N/4 + 16N/4 = 31N/4$ 

Therefore, the complexity of one stage is 9N/4 (real multipliers) and 31N/4 (real adders). For the rest of the paper, case II is considered for estimating the real multipliers. Table I provides an estimation of multipliers and adders in terms of LUTs. The computations consider a 16-bit input and a 16-bit TF, with no bit growth. Each adder occupies 16 LUTs. A  $16 \times 16$  multiplier is modeled as sixteen partial-product rows that are combined with fifteen adder operations; since each adder operation consumes 16 LUTs, one multiplication requires  $16 \times (16-1) = 240$  LUTs.

 TABLE I

 COMPLEXITY ESTIMATION OF THE CONVENTIONAL FFT

| FFT Size | Real Adders | Adder LUTs | Real Multipliers | Multiplier LUTs | Total LUTs |
|----------|-------------|------------|------------------|-----------------|------------|
| 4        | 31     | 496    | 9           | 2160        | 2656    |
| 16       | 248    | 3968   | 72          | 17280       | 21248   |
| 64       | 1488   | 23808  | 432         | 103680      | 127488  |
| 256      | 7936   | 126976 | 2304        | 552960      | 679936  |
| 1024     | 39680  | 634880 | 11520       | 2764800     | 3399680 |
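The entries of Table I follow directly from the case II counts and the LUT model above. The following sketch recomputes them; the function and constant names are illustrative.

```python
ADDER_LUTS = 16                  # one 16-bit real adder
MULTIPLIER_LUTS = 16 * (16 - 1)  # 16-bit data x 16-bit TF: 240 LUTs

def log4(n):
    stages = 0
    while n > 1:
        if n % 4:
            raise ValueError(f"{n} is not a power of 4")
        n //= 4
        stages += 1
    return stages

def conventional_radix4(n):
    """Return (real adders, adder LUTs, real multipliers, multiplier LUTs, total LUTs)."""
    stages = log4(n)
    complex_mults = 3 * (n // 4) * stages
    complex_adds = 8 * (n // 4) * stages
    real_mults = 3 * complex_mults                   # case II: 3 real multipliers each
    real_adds = 5 * complex_mults + 2 * complex_adds # case II: 5 real adders each
    adder_luts = real_adds * ADDER_LUTS
    mult_luts = real_mults * MULTIPLIER_LUTS
    return real_adds, adder_luts, real_mults, mult_luts, adder_luts + mult_luts

for size in (4, 16, 64, 256, 1024):
    print(size, *conventional_radix4(size))
```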

#### B. Estimation of LUTs for fully unrolled radix-4 FFT

The theoretical totals of complex multipliers and complex adders are  $3 \times N/4 \times \log_4(N)$  and  $N \times \log_2(N)$ , respectively. Upon careful examination, however, the number of multipliers needed in a fully unrolled architecture is lower than the theoretical count. Additionally, a fully unrolled architecture provides a further resource reduction when not all the outputs of the FFT are taken to the digital and radio frequency chains. During the analysis, the multiplications by the TF  $W_N^0 = 1$  are eliminated, since they would not be performed in an efficient implementation: the data is simply multiplied by one. A fully unrolled radix-4 architecture for a 4-point FFT requires eight complex adders and no complex multipliers. For the 16-point radix-4 FFT, nine complex multipliers and sixty-four complex adders are required. The detailed complexity comparison for different radix-4 FFT sizes is given in Table II.

 TABLE II

 COMPLEXITY COMPARISON OF CONVENTIONAL AND FULLY UNROLLED RADIX-4 FFT

| FFT Size | Conventional Complex Multipliers | Conventional Complex Adders | Optimized Complex Multipliers | Optimized Complex Adders |
|----------|----------------------------------|-----------------------------|-------------------------------|--------------------------|
| 4        | 3            | 8            | 0           | 8         |
| 16       | 24           | 64           | 9           | 64        |
| 64       | 144          | 384          | 81          | 384       |
| 256      | 768          | 2048         | 513         | 2048      |
| 1024     | 3840         | 10240        | 2817        | 10240     |

Table III provides an approximate estimation of multipliers and adders in terms of LUTs. Considering input bit size of 16-bit with TF multiplication of 16-bit, each adder output occupies 16 LUTs, and multiplier output occupies 240 LUTs.

 TABLE III

 COMPLEXITY ESTIMATION OF FULLY UNROLLED FFT

| FFT Size | Real Adders | Adder LUTs | Real Multipliers | Multiplier LUTs | Total LUTs |
|----------|-------------|------------|------------------|-----------------|------------|
| 4        | 16     | 256    | 0           | 0           | 256     |
| 16       | 173    | 2768   | 27          | 6480        | 9248    |
| 64       | 1173   | 18768  | 243         | 58320       | 77088   |
| 256      | 6661   | 106576 | 1539        | 369360      | 475936  |
| 1024     | 34565  | 553040 | 8451        | 2028240     | 2581280 |
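The unrolled counts in Tables II and III can be reproduced with the short sketch below. It relies on an observation that is not stated in the text: for every size listed in Table II, the optimized multiplier count equals the conventional count minus N - 1, which is treated here as the number of removed unity-twiddle multiplications. The names are illustrative, and case II (three real multipliers and five real adders per complex multiplier) is assumed throughout.

```python
ADDER_LUTS = 16
MULTIPLIER_LUTS = 240  # 16-bit data x 16-bit TF

def log4(n):
    stages = 0
    while n > 1:
        if n % 4:
            raise ValueError(f"{n} is not a power of 4")
        n //= 4
        stages += 1
    return stages

def unrolled_radix4(n):
    """Return (complex mults, complex adds, real adds, real mults, total LUTs)."""
    stages = log4(n)
    # Assumption inferred from Table II: N - 1 multiplications by W^0 = 1 are dropped.
    complex_mults = 3 * (n // 4) * stages - (n - 1)
    complex_adds = 8 * (n // 4) * stages
    real_mults = 3 * complex_mults
    real_adds = 5 * complex_mults + 2 * complex_adds
    total = real_adds * ADDER_LUTS + real_mults * MULTIPLIER_LUTS
    return complex_mults, complex_adds, real_adds, real_mults, total

for size in (4, 16, 64, 256, 1024):
    print(size, *unrolled_radix4(size))
```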

Another advantage of the fully unrolled architecture arises when only  $\frac{1}{2}$  or  $\frac{1}{4}$  of the outputs are used. In that case the area is reduced, since the last butterfly stage contains only adders and no multipliers, and the adders that feed unused outputs can be omitted. An analysis for various FFT sizes is presented in Table IV.

 TABLE IV

 COMPLEXITY COMPARISON WHEN OUTPUTS ARE NOT FULLY UTILIZED

| FFT Size | Complex multipliers when half of the outputs are used | Complex adders when half of the outputs are used | Complex multipliers when a quarter of the outputs are used | Complex adders when a quarter of the outputs are used |
|----------|-------------|-------------|--------------|-------------|
| 4        | 0           | 4           | 0            | 2           |
| 16       | 9           | 48          | 9            | 40          |
| 64       | 81          | 320         | 81           | 288         |
| 256      | 513         | 1792        | 513          | 1664        |
| 1024     | 2817        | 9216        | 2817         | 8704        |
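The adder counts in Table IV are consistent with a simple pruning rule: every stage except the last keeps its 2N complex adders, and the last stage keeps only the fraction that corresponds to the outputs in use. This rule is inferred from the table values rather than stated in the text, so the sketch below should be read as an assumption; the function name is illustrative.

```python
from fractions import Fraction

def log4(n):
    stages = 0
    while n > 1:
        if n % 4:
            raise ValueError(f"{n} is not a power of 4")
        n //= 4
        stages += 1
    return stages

def complex_adders_partial(n, used_fraction):
    """Complex adders when only `used_fraction` of the FFT outputs are needed."""
    per_stage = 8 * (n // 4)                  # 8 complex adders per butterfly
    full_stages = per_stage * (log4(n) - 1)   # all stages before the last one
    last_stage = per_stage * Fraction(used_fraction)
    return int(full_stages + last_stage)

for size in (4, 16, 64, 256, 1024):
    half = complex_adders_partial(size, Fraction(1, 2))
    quarter = complex_adders_partial(size, Fraction(1, 4))
    print(size, half, quarter)
```

The multiplier columns of Table IV match the optimized column of Table II, which agrees with the statement that the last stage contains no multipliers.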

From the analysis above, we can conclude that the fully unrolled FFT operation is an appealing solution for the complexity reduction of the beamforming operation. The implementation of the unrolled FFT imposes some additional design challenges due to the lack of availability of commercial IP blocks working in an unrolled fashion.

#### C. Estimation of LUTs for 4-bit quantized TF fully unrolled radix-4 FFT

Even though the fully unrolled FFT is an appealing solution for beamforming operations, it still has high area and power requirements. To reduce the area and power consumption, a 4-bit quantized TF FFT is analyzed. It is shown in Section III that the 4-bit quantized TF FFT is still a linear operation and does not introduce interference from adjacent beams. The complexity analysis of the 4-bit quantized TF FFT in terms of LUTs is presented in Table V. The input bit width is 16 bits, and the TF bit width is 4 bits. Each adder occupies 16 LUTs, and each multiplier occupies  $16 \times (4-1) = 48$  LUTs.
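The effect of the TF bit width on the LUT estimate can be sketched with the multiplier model used in this section, data width times (TF width - 1). The real adder and multiplier counts below are copied from Table III; the dictionary and function names are illustrative.

```python
DATA_BITS = 16

# FFT size -> (real adders, real multipliers), taken from Table III.
UNROLLED_COUNTS = {
    4: (16, 0),
    16: (173, 27),
    64: (1173, 243),
    256: (6661, 1539),
    1024: (34565, 8451),
}

def multiplier_luts(tf_bits):
    """LUTs of one real multiplier for a given twiddle factor bit width."""
    return DATA_BITS * (tf_bits - 1)

def total_luts(fft_size, tf_bits):
    real_adds, real_mults = UNROLLED_COUNTS[fft_size]
    return real_adds * DATA_BITS + real_mults * multiplier_luts(tf_bits)

for size in UNROLLED_COUNTS:
    full = total_luts(size, 16)
    quantized = total_luts(size, 4)
    print(size, full, quantized, round(quantized / full, 3))
```

In this model only the multiplier term shrinks, so the relative saving grows with the share of multipliers in the design.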

## III. BEAMFORMING WITH 4-BIT QUANTIZED TF FFT

Based on the LUT estimation in Section II, it is concluded that the 4-bit quantized TF FFT occupies fewer LUTs than the fully unrolled and conventional FFTs. Therefore, the 4-bit quantized TF FFT can be incorporated for beamforming applications.

 TABLE V

 COMPLEXITY ESTIMATION OF 4-BIT QUANTIZED TF FFT

| FFT Size | Real Adders | Adder LUTs | Real Multipliers | Multiplier LUTs | Total LUTs |
|----------|-------------|------------|------------------|-----------------|------------|
| 4        | 16          | 256        | 0                | 0               | 256        |
| 16       | 173         | 2768       | 27               | 1296            | 4064       |
| 64       | 1173        | 18768      | 243              | 11664           | 30432      |
| 256      | 6661        | 106576     | 1539             | 73872           | 180448     |
| 1024     | 34565       | 553040     | 8451             | 405648          | 958688     |

To measure the performance of the 4-bit quantized TF FFT, the array factor of a uniform linear array (ULA) with sixty-four elements is plotted in Fig. 1 for both the 4-bit quantized TF FFT and the conventional FFT. From the plot in Fig. 1, it is observed that the transformation with the 4-bit quantized TF FFT is still a linear operation. In Fig. 1, the input is applied at bin #3; the corresponding array factor plots for bin #28 and bin #57, with the same number of ULA elements, are presented in Fig. 2 and Fig. 3. The plots in Fig. 1, Fig. 2, and Fig. 3 indicate that the 4-bit quantized TF FFT retains the same performance as the conventional FFT.
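A small simulation in the spirit of this experiment is sketched below. It is not the authors' setup: it uses a recursive radix-2 FFT instead of the radix-4 architecture for brevity, and it assumes a symmetric quantizer that rounds the real and imaginary parts of each twiddle factor to multiples of 1/7 (fifteen levels, representable in 4 bits). All names are illustrative.

```python
import cmath
import math

LEVELS = 7  # assumed 4-bit grid: real/imag parts rounded to k/7, k = -7..7

def exact_twiddle(k, n):
    return cmath.exp(-2j * math.pi * k / n)

def quantized_twiddle(k, n):
    w = exact_twiddle(k, n)
    return complex(round(w.real * LEVELS) / LEVELS, round(w.imag * LEVELS) / LEVELS)

def fft(x, twiddle):
    """Recursive radix-2 decimation-in-time FFT with a pluggable twiddle source."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], twiddle)
    odd = fft(x[1::2], twiddle)
    out = [0j] * n
    for k in range(n // 2):
        t = twiddle(k, n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def peak_bin(spectrum):
    mags = [abs(v) for v in spectrum]
    return max(range(len(mags)), key=mags.__getitem__)

N = 64  # sixty-four ULA elements

def tone(b):
    """Array snapshot whose energy belongs to beam (bin) b."""
    return [cmath.exp(2j * math.pi * b * n / N) for n in range(N)]

x3, x28 = tone(3), tone(28)
exact_out = fft(x3, exact_twiddle)
quant_out = fft(x3, quantized_twiddle)
print(peak_bin(exact_out), peak_bin(quant_out))

# Linearity check: transform of a weighted sum vs. weighted sum of transforms.
mixed = fft([2 * a - 3j * b for a, b in zip(x3, x28)], quantized_twiddle)
recombined = [2 * a - 3j * b for a, b in zip(quant_out, fft(x28, quantized_twiddle))]
max_dev = max(abs(p - q) for p, q in zip(mixed, recombined))
print(max_dev < 1e-9)
```

In this sketch the main beam stays at the excited bin after quantization and the transform remains linear; the sidelobe levels depend on the assumed quantizer and are not meant to reproduce Figs. 1 to 3.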



Fig. 1. Beamforming with conventional and 4-bit quantized TF FFT at bin #3



Fig. 2. Beamforming with conventional and 4-bit quantized TF FFT at bin #28

Fig. 3. Beamforming with conventional and 4-bit quantized TF FFT at bin #57

## IV. RESULTS

This section describes a preliminary assessment of the complexity of the conventional FFT, the fully unrolled FFT, and the 4-bit quantized TF FFT for different array sizes suitable for GEO, MEO, and LEO scenarios. The evaluated techniques include only the fixed-beamforming processing achieved with a "dense" matrix that multiplies an input vector. Such a matrix must be a DFT matrix, an approximation of the DFT matrix, or a matrix with a set of suitable properties (e.g., a condition number close to one). Table VI summarizes the estimated complexity for each approach in terms of the number of LUTs and the power consumption. Each LUT represents a 1-bit-with-carry logic operation. The number of flip flops (FFs) is not included, for simplicity of explanation. Generally, the number of FFs is very similar to the number of LUTs, with some variations in the final implementation that depend on the pipeline strategy used. An additional FPGA resource not included here is the number of memory blocks. These elements are used mainly to store the TFs in the conventional rolled FFT, so unrolled architectures benefit from not requiring those memory blocks.

The LUTs in Table VI are quoted for a 2D FFT in each 500 MHz of bandwidth. In the GEO scenario, the total bandwidth of 3000 MHz is divided into six sub-bands of 500 MHz each. In the MEO scenario, the bandwidth of 1500 MHz is divided into three sub-bands of 500 MHz each, with each sub-band related to one clock cycle of the FPGA. Further, the sub-banding/channelization is assumed to be already present in the system. The approximate power estimation for the three selected scenarios, based on the number of occupied LUTs, is also presented in Table VI. The power estimation was carried out using the Xilinx Power Estimator tool, and the selected FPGA device is the Virtex UltraScale+ XCVU19P-1FSVA3824E, which contains eight million logic cells. With this device, the GEO scenario with the 4-bit quantized TF FFT requires sixty FPGAs for a total bandwidth of 3 GHz, the MEO scenario requires three FPGAs, and the LEO scenario requires one FPGA.
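The per-sub-band LUT figures for the GEO scenario appear consistent with counting 2N one-dimensional N-point FFTs per 2D transform (N row transforms plus N column transforms), using the 256-point totals from Tables I, III, and V. The sketch below makes that assumption explicit. It also multiplies by the number of 500 MHz sub-bands, which assumes one 2D beamformer per sub-band. The names are illustrative.

```python
# Total LUTs of one 256-point 1D FFT, from Tables I, III, and V.
ONE_D_LUTS_256 = {
    "conventional": 679_936,
    "fully_unrolled": 475_936,
    "quantized_tf": 180_448,
}

def luts_2d(n, luts_1d):
    """LUTs of an n x n 2D FFT built from n row and n column 1D FFTs (assumed)."""
    return 2 * n * luts_1d

def sub_bands(total_mhz, chunk_mhz=500):
    """Number of 500 MHz sub-bands in the total bandwidth."""
    return total_mhz // chunk_mhz

geo_per_chunk = {name: luts_2d(256, luts) for name, luts in ONE_D_LUTS_256.items()}
# Assumption: one 2D beamformer is instantiated per sub-band.
geo_total = {name: luts * sub_bands(3000) for name, luts in geo_per_chunk.items()}

for name in ONE_D_LUTS_256:
    print(name, geo_per_chunk[name], geo_total[name])
```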

 TABLE VI

 RESOURCE ESTIMATION FOR THE THREE SELECTED SCENARIOS

| Parameter | Conventional FFT | Fully Unrolled FFT | 4-bit Quantized TF FFT |
|-----------|------------------|--------------------|------------------------|
| GEO mission reference scenario | | | |
| Number of RF chains | 145 x 145 | 145 x 145 | 145 x 145 |
| Total Bandwidth (MHz) | 3000 | 3000 | 3000 |
| 256-point 2D FFT LUTs for bandwidth chunk of 500 MHz | 348,127,232 | 243,679,232 | 92,389,376 |
| Total Power Consumption (W) in 3000 MHz | 37,199.862 | 26,038.854 | 9,872.46 |
| MEO mission reference scenario | | | |
| Number of RF chains | 10 x 10 | 10 x 10 | 10 x 10 |
| Total Bandwidth (MHz) | 1500 | 1500 | 1500 |
| 16-point 2D FFT LUTs for bandwidth chunk of 500 MHz | 679,936 | 297,088 | 130,048 |
| Total Power Consumption (W) in 1500 MHz | 36.327 | 15.873 | 6.948 |
| LEO mission reference scenario | | | |
| Number of RF chains | 12 x 12 | 12 x 12 | 12 x 12 |
| Total Bandwidth (MHz) | 500 | 500 | 500 |
| 16-point 2D FFT LUTs for bandwidth chunk of 500 MHz | 679,936 | 297,088 | 130,048 |
| Total Power Consumption (W) in 500 MHz | 12.109 | 5.291 | 2.316 |

## V. CONCLUSION

In this paper, an area-power analysis is carried out for FFT-based digital beamforming. Initially, the area of real adders and multipliers is estimated for the conventional FFT, the fully unrolled FFT, and the 4-bit quantized TF FFT. The area is also presented in LUTs for different FFT sizes. From the complexity analysis, it is observed that the 4-bit quantized TF FFT occupies the least area. It is worth mentioning that additional improvements can be achieved in the unrolled architectures, since many of the quantized TF operations can be reduced in a one-by-one evaluation. Also, to make a fair comparison among these FFTs, beamforming with a uniform linear array is simulated with the conventional and the 4-bit quantized TF FFT. It is noted that the 4-bit quantized TF FFT transformation is a linear operation and that the beam is not degraded. Furthermore, the performance assessment for different array sizes suitable for GEO, MEO, and LEO scenarios is presented. These scenarios report the occupied area and the power consumed for different bandwidths, and show that the 4-bit quantized TF FFT occupies less area and consumes less power. Consequently, the analyzed approaches are promising for advancing the feasibility of fully digital beamforming in satellite communication systems.

## ACKNOWLEDGMENT

This work was supported by the European Space Agency under project number 4000134678/21/UK/AL "EFFICIENT DIGITAL BEAMFORMING TECHNIQUES FOR ON-BOARD DIGITAL PROCESSORS (EGERTON)" and by SES S.A. (Opinions, interpretations, recommendations and conclusions presented in this paper are those of the authors and are not necessarily endorsed by the European Space Agency or SES). This work was supported in part by the Fonds National de la Recherche Luxembourg, under the CORE project Nr. 11689919 COHESAT: Cognitive Cohesive Networks of Distributed Units for Active and Passive Space Applications.

## REFERENCES

- [1] O. Kodheli et al., (2021). Satellite Communications in the New Space Era: A Survey and Future Challenges. IEEE Communications Surveys Tutorials, 23(1), 70-109.
- [2] A. Arora, C. G. Tsinos, B. Shankar Mysore R, S. Chatzinotas and B. Ottersten, (2021). Analog Beamforming With Antenna Selection For Large-Scale Antenna Arrays. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4795-4799.
- [3] X. Zhai, X. Chen, J. Xu and D. W. Kwan Ng (2021). Hybrid Beamforming for Massive MIMO Over-the-Air Computation. IEEE Transactions on Communications, 69(4), 2737-2751
- [4] I. Ahmed, et al, (2018). A Survey on Hybrid Beamforming Techniques in 5G: Architecture and System Model Perspectives. IEEE Communications Surveys Tutorials, 20(4) 3060-3097.
- [5] Wei, T., Feng, W., Chen, Y., Wang, C. X., Ge, N., and Lu, J. (2021). Hybrid satellite-terrestrial communication networks for the maritime Internet of Things: Key technologies, opportunities, and challenges. IEEE Internet of Things Journal, 8(11), 8910-8934.
- [6] P. Angeletti and M. Lisi, (2015). Digital beam-forming network with reduced complexity and low power consumption for array antennas. 21st Ka and Broadband Communications Conference.
- [7] Sassan Ahmadi, (2019). Chapter 5 New Radio Access RF and Transceiver Design Considerations. 5G NR, Academic Press, 655-745.
- [8] Suarez, D., Cintra, R. J., Bayer, F. M., Sengupta, A., Kulasekera, S., and Madanayake, A. (2014). Multi-beam RF aperture using multiplierless FFT approximation. Electronics Letters, 50(24), 1788-1790.
- [9] A. Madanayake, V. Ariyrathna and S. kumar Pulipati, (2017). Design and prototype implementation of an 8-beam 2.4 GHz array receiver for digital beamforming. IEEE National Aerospace and Electronics Conference (NAECON), 91-97.
- [10] P. Angeletti and R. De Gaudenzi, (2020). A Pragmatic Approach to Massive MIMO for Broadband Communication Satellites. IEEE Access, 8, 132212-132236.
- [11] R. Palisetty, A. K. Panda and K. C. Ray, (2021). ASIC Implementation of Low PAPR Multidevice Variable-Rate Architecture for IEEE 802.11ah. IEEE Transactions on Instrumentation and Measurement, 70, 1-10.
- [12] C. Yu, M. Yen, P. Hsiung and S. Chen, (2011). A low-power 64-point pipeline FFT/IFFT processor for OFDM applications. IEEE Transactions on Consumer Electronics, 57(1), 40-40.
- [13] S. H. Mirfarshbafan, S. Taner and C. Studer, (2021). SMUL-FFT: A Streaming Multiplierless Fast Fourier Transform. IEEE Transactions on Circuits and Systems II: Express Briefs, 68(5), 1715-1719.
- [14] Ganesh Kumar Ganjikunta, Subhendu Kumar Sahoo, (2017). An area-efficient and low-power 64-point pipeline Fast Fourier Transform for OFDM applications. Integration, 57, 125-131.
- [15] Charles Wu, (1998). Implementing the Radix-4 Decimation in Frequency (DIF) Fast Fourier Transform (FFT) Algorithm Using a TMS320C80 DSP. Digital Signal Processing Solutions, 9-24.
- [16] S. Kala, S. Nalesh, A. Maity, S. K. Nandy and R. Narayan, (2013). High throughput, low latency, memory optimized 64K point FFT architecture using novel radix-4 butterfly unit. IEEE International Symposium on Circuits and Systems (ISCAS), 3034-3037.
- [17] S. H. Mirfarshbafan, S. Taner and C. Studer, (2021). SMUL-FFT: A Streaming Multiplierless Fast Fourier Transform. IEEE Transactions on Circuits and Systems II: Express Briefs, 68(5), 1715-1719.
- [18] E. E. Swartzlander and H. H. Saleh, (2009). Floating-point implementation of complex multiplication. Conference Record of the Forty-Third Asilomar Conference on Signals, Systems and Computers, 926-929.