July 29, 2016 at 4:32 PM #8206
I have made a page where you can see the de-blur in action. You can use the sliders to change the lpf value to see how it looks.
It uses the current pixel and the previous pixel to calculate the original pixel by reversing a generic low-pass filter function.
On the 256×240 optimized mode capture it works very well. It also works on generic linetriple but the capture had noise that also gets sharpened.
It would be a nice feature for the non 1-CHIP consoles. 🙂
August 1, 2016 at 10:10 AM #8238
I think this would be possible but would require a lot of work.
August 1, 2016 at 3:24 PM #8244
I think the tricky part will be the reverse lpf formula, because I have read that division is difficult to do on an FPGA. This is the formula I am using:
originalData = PrevData - ((PrevData - Data) / lpfValue)
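To see why this formula undoes the blur, here is a minimal Python sketch. It assumes the blur behaves like a first-order low-pass filter in which each captured pixel moves a fraction `lpf` (between 0 and 1) of the way from the previous captured pixel towards the true pixel. That filter model and the names `blur` and `deblur` are assumptions for illustration; they are not taken from the demo page.

```python
def blur(pixels, lpf):
    """Assumed blur model: each output moves a fraction `lpf`
    (0 < lpf <= 1) of the way from the previous output to the input."""
    out = []
    prev = pixels[0]  # assume the line starts settled at its first value
    for x in pixels:
        prev = prev + (x - prev) * lpf
        out.append(prev)
    return out


def deblur(blurred, lpf):
    """Apply originalData = PrevData - ((PrevData - Data) / lpfValue)
    to every pixel and its predecessor."""
    out = []
    prev = blurred[0]
    for data in blurred:
        out.append(prev - (prev - data) / lpf)
        prev = data
    return out


line = [0, 0, 255, 255, 0, 0]
soft = blur(line, 0.5)     # smeared edges
sharp = deblur(soft, 0.5)  # edges restored when lpf matches the blur
print([round(v, 2) for v in soft])
print([round(v, 2) for v in sharp])
```

Under this model the reverse formula recovers the original line when the same `lpf` value is used for both steps. On a real capture the value has to be found by eye, which is what the sliders are for.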
I could change it to multiplication with a lookup table for the different lpf values if that's needed.
August 2, 2016 at 11:45 PM #8260
Floating point division on Cyclone IV would be slow but probably sufficient for the 5.37MHz pixel clock when using 256×240 optimized mode with SNES/NES. That could be tested after getting more important/generic features out of the way.
August 3, 2016 at 5:07 PM #8270
Ok thanks. I have updated the page with a faster multiplication version in case division turns out to be too slow for normal linedouble or generic linetriple mode.
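The multiplication version in this post ends with a multiply by 656 and a right shift by 16 in place of a divide by 100. A quick Python check of how close that shortcut is, assuming the input stays within 0 to 255 * 100 (the range of PrevData * 100 for 8-bit data; the helper name `div100_approx` is made up for this sketch):

```python
# Compare the multiply-and-shift shortcut with true integer division by 100.
# Assumed input range: 0 .. 255 * 100, i.e. PrevData * 100 for 8-bit data.
def div100_approx(n):
    return (n * 656) >> 16


errors = {n: div100_approx(n) - n // 100 for n in range(255 * 100 + 1)}
mismatches = [n for n, e in errors.items() if e != 0]

print("largest error:", max(errors.values()))
print("smallest error:", min(errors.values()))
print("first mismatch at n =", mismatches[0])
```

Within that range the shortcut is never too low and at most one step too high, because 656 / 65536 is slightly larger than 1 / 100.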
a = (PrevData - Data) * lpf_values[lpfv];
b = PrevData * 100;
originalData = ((b - a) * 656) >> 16; // (b - a) / 100
January 28, 2017 at 11:27 PM #11005
I have been working on implementing this on the OSSC. This is the function I am using now:
function [7:0] apply_reverse_lpf;
    input enable;
    input [7:0] data;
    input [7:0] data_prev;
    input [8:0] lpfv;                     // filter strength, 128 = off (gain 1.0)
    int a, b, c;
    begin
        a = data_prev << 7;               // previous pixel scaled by 128
        b = (data_prev - data) * lpfv;    // difference scaled by the filter strength
        c = (a < b ? 0 : (a - b));        // clamp at 0
        if (enable)
            apply_reverse_lpf = (c > (255 << 7)) ? 8'hFF : (c >> 7);  // clamp at 255
        else
            apply_reverse_lpf = data;
    end
endfunction
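For anyone who wants to try values without synthesizing, the same integer arithmetic can be modelled in Python. This is a sketch of the function above and not a replacement for it; the `shift` parameter is added here for illustration.

```python
def apply_reverse_lpf(enable, data, data_prev, lpfv, shift=7):
    """Integer model of the reverse LPF.
    lpfv == 1 << shift corresponds to a gain of 1.0 (filter off)."""
    if not enable:
        return data
    a = data_prev << shift
    b = (data_prev - data) * lpfv   # negative on a rising edge
    c = max(a - b, 0)               # clamp at 0
    return min(c >> shift, 255)     # clamp at 255


print(apply_reverse_lpf(True, 150, 100, 128))  # gain 1.0: pixel unchanged
print(apply_reverse_lpf(True, 150, 100, 256))  # gain 2.0: step is doubled
print(apply_reverse_lpf(True, 100, 200, 384))  # falling edge clamps at 0
print(apply_reverse_lpf(True, 200, 100, 256))  # rising edge clamps at 255
```

The effective gain is lpfv / 128, so it plays the role of 1 / lpfValue in the earlier division formula.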
The function compares the current pixel to the previous pixel. In the 256×240 optimized mode the source pixels are repeated, so the previous pixel should only be stored at the last repeated pixel.
I check when linebuf_hoffset changes and it seems to change at the third repeat. So I use lpf_last_hoffset to store the previous pixel at the following cycle.
I added this after calling the reverse_lpf function:
if ((V_MULTMODE == V_MULTMODE_3X) & (H_MULTMODE == H_MULTMODE_OPTIMIZED)) begin
    if (lpf_last_hoffset == 1'b1) begin
        R_prev <= R_act;
        G_prev <= G_act;
        B_prev <= B_act;
    end
    if (linebuf_hoffset_prev != linebuf_hoffset)
        lpf_last_hoffset <= 1'b1;
    else
        lpf_last_hoffset <= 1'b0;
    linebuf_hoffset_prev <= linebuf_hoffset;
end else begin
    R_prev <= R_act;
    G_prev <= G_act;
    B_prev <= B_act;
end
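The reason for holding the previous pixel until the last repeat can be shown with a simplified Python model. It does not model linebuf_hoffset at all: it simply assumes every source pixel is output `repeat` times, and `sharpen` is a floating-point stand-in for the reverse LPF with a made-up gain of 2.0.

```python
def sharpen(data, prev, gain=2.0):
    """Float stand-in for the reverse LPF, clamped to 0..255."""
    return max(0, min(255, round(prev - (prev - data) * gain)))


def run(source, repeat, hold_prev):
    """Output each source pixel `repeat` times and return the result.
    hold_prev=True updates the previous pixel only on the last repeat;
    hold_prev=False updates it on every output cycle."""
    out = []
    prev = source[0]
    for pixel in source:
        for i in range(repeat):
            out.append(sharpen(pixel, prev))
            last_repeat = (i == repeat - 1)
            if last_repeat or not hold_prev:
                prev = pixel
    return out


src = [100, 150, 150]
print(run(src, 3, hold_prev=False))  # only the first repeat is sharpened
print(run(src, 3, hold_prev=True))   # all three repeats are sharpened
```

When the previous pixel is updated every cycle, the second and third repeats compare a pixel with itself, so only a third of each source pixel gets corrected.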
Any improvement suggestions are welcome. There is probably a better way to check the input pixel offset.
Here is a picture of it in action in lineX3 optimized:
January 29, 2017 at 9:45 PM #11017
That looks nice! Along with scaling filters and line4x/5x, this should be exciting news to SNES owners.
Does apply_reverse_lpf have any effect on timing, i.e. does pclk3x Fmax still stay around 150MHz? The effective linetriple pixel clock is only 80MHz, but an ideal Line5x implementation (not using HDMI_TX pixel replication as is done now) would need to run at 160MHz, in which case apply_reverse_lpf might need to be pipelined.
January 29, 2017 at 10:07 PM #11019
- This reply was modified 1 month, 3 weeks ago by marqs.
Exciting news indeed! Really looking forward to trying this out! 😀
January 29, 2017 at 11:16 PM #11022
If I am looking at it correctly, the Timing analyzer reports pclk3x Fmax to be around 60MHz now. That can’t be good 😀
January 29, 2017 at 11:49 PM #11023
Ok, that’s what I suspected. The good news is that in optimized modes there are several clock cycles available per input pixel, so a pipelined implementation should easily pass timing requirements.
January 30, 2017 at 6:33 PM #11028
Alternatively, you could organize the code so that output of apply_reverse_lpf is captured to registers after N (>=2) cycles from the moment when its inputs are changed, and then constrain respective paths to multicycle (set_multicycle_path) to avoid timing violations. The registers driving apply_reverse_lpf inputs must remain stable for the same N cycles, but that should be a given in 256/320 column optimized modes.
January 31, 2017 at 12:41 AM #11032
I am new to Verilog/FPGA so my knowledge is very limited. I have no idea how to change it to a pipelined version. Would it still work in linedouble mode then?
For the alternative, do you mean something like this? Compare R_act to R_prev and, when it has changed, increment a register every cycle; then, when the register reaches 2, do R_pp1 <= apply_reverse_lpf. I guess that wouldn't work with linedouble/generic linetriple.
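A rough cycle-level model of that idea, written in Python and not Verilog, might look like the sketch below. The function `capture_stream`, its parameters, and the doubling stand-in for the slow calculation are all hypothetical; the model only shows when a result would be latched, not real FPGA timing.

```python
def capture_stream(inputs, n_cycles, func):
    """One list entry per clock cycle. `func` is treated as needing
    `n_cycles` cycles to settle, so its result is latched only once the
    input has been held for that many cycles."""
    captured = None   # register holding the last latched result
    stable = 0        # number of cycles the current input has been held
    last = None
    trace = []
    for value in inputs:
        if value != last:
            stable = 1
            last = value
        else:
            stable += 1
        if stable == n_cycles:
            captured = func(value)
        trace.append(captured)
    return trace


def double(v):
    return v * 2      # stand-in for the slow calculation


# Optimized mode: every input is held for three cycles.
print(capture_stream([10, 10, 10, 20, 20, 20], 2, double))
# Generic mode: the input changes every cycle, so nothing is ever latched.
print(capture_stream([10, 20, 30], 2, double))
```

In this model the second run never latches a value, which matches the guess that the approach depends on each input being held for several cycles.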
Are you willing to make the pipelined version? With your knowledge it will probably take you a few minutes 🙂
February 11, 2017 at 12:29 PM #11142
A multicycle implementation would only work in modes that have the required number of cycles for each read from the linebuffer (i.e. limited to optimized modes), while a pipelined implementation could work regardless of line multiplication mode. I’ll take a look into this after getting the upcoming fw release done.
February 12, 2017 at 2:35 PM #11161
Thanks! The apply_reverse_lpf function can be optimized a bit by bitshifting by 4 instead of 7 and using shortint instead of int. The lpfvalue range is then 16 (off) to 63. Fmax was around 80MHz that way.
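A quick Python check of why the smaller shift lets the intermediate values fit in a 16-bit signed shortint. It assumes 8-bit pixel data, the 16 to 63 range for the shift of 4, and the full 9-bit lpfv input (up to 511) for the original shift of 7; the helper `worst_case` is made up for this sketch.

```python
def worst_case(shift, lpfv_max):
    """Largest value of a - b for 8-bit pixels, where
    a = data_prev << shift and b = (data_prev - data) * lpfv."""
    a_max = 255 << shift      # data_prev = 255
    b_max = 255 * lpfv_max    # largest magnitude of b, reached when it is negative
    return a_max + b_max


SHORTINT_MAX = (1 << 15) - 1  # upper limit of a 16-bit signed value

for shift, lpfv_max in [(4, 63), (7, 511)]:
    peak = worst_case(shift, lpfv_max)
    print(f"shift {shift}, lpfv up to {lpfv_max}: peak {peak}, "
          f"fits in shortint: {peak <= SHORTINT_MAX}")
```

The trade-off is resolution: with a shift of 4 the gain moves in steps of 1/16 (from 1.0 at 16 to about 3.94 at 63) instead of steps of 1/128.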